I. Basis of the report

1. With regard to the elements of the international application:*

[X] the international application as originally filed

[X] the description:
    pages 1-62, as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[X] the claims:
    pages 63-71, as originally filed
    pages NONE, as amended (together with any statement) under Article 19
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[X] the drawings:
    pages 1-56, as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of

[X] the sequence listing part of the description:
    pages       , as originally filed
    pages NONE, filed with the demand
    pages NONE, filed with the letter of


2. With regard to the language, all the elements marked above were available or furnished to this Authority in the language in which 
the international application was filed, unless otherwise indicated under this item. 

These elements were available or furnished to this Authority in the following language which is: 

[ ] the language of a translation furnished for the purposes of international search (under Rule 23.1(b)).
[ ] the language of publication of the international application (under Rule 48.3(b)).
[ ] the language of the translation furnished for the purposes of international preliminary examination (under Rules 55.2 and/
    or 55.3).

3. With regard to any nucleotide and/or amino acid sequence disclosed in the international application, the international 
preliminary examination was carried out on the basis of the sequence listing: 

[ ] contained in the international application in printed form.
[X] filed together with the international application in computer readable form.
[ ] furnished subsequently to this Authority in written form.
[ ] furnished subsequently to this Authority in computer readable form.

[ ] The statement that the subsequently furnished written sequence listing does not go beyond the disclosure in the
    international application as filed has been furnished.

[ ] The statement that the information recorded in computer readable form is identical to the written sequence listing has
    been furnished.

4. [X] The amendments have resulted in the cancellation of:

       the description, pages NONE

       the claims, Nos. NONE

       [X] the drawings, sheets/fig NONE

5. [X] This report has been drawn as if (some of) the amendments had not been made, since they have been considered to go
       beyond the disclosure as filed, as indicated in the Supplemental Box (Rule 70.2(c)).**

* Replacement sheets which have been furnished to the receiving Office in response to an invitation under Article 14 are referred to
in this report as "originally filed" and are not annexed to this report since they do not contain amendments (Rules 70.16
and 70.17).

**Any replacement sheet containing such amendments must be referred to under item 1 and annexed to this report.

V. Reasoned statement under Article 35(2) with regard to novelty, inventive step or industrial applicability;
citations and explanations supporting such statement 



1. Statement

   Novelty (N)                     Claims 1-20, 23-32, 34-43, 45-59    YES
                                   Claims 21, 22, 33, 44               NO

   Inventive Step (IS)             Claims 1-20, 23-32, 34-43, 45-59    YES
                                   Claims 21, 22, 33, 44               NO

   Industrial Applicability (IA)   Claims 1-59                         YES
                                   Claims NONE                         NO



2. citations and explanations (Rule 70.7) 

Claims 21, 22, 33 and 44 lack novelty under PCT Article 33(2) as being anticipated by GenBank Accession Nos. U66687, 
D77412, U66674 and R97754, respectively. 

U66687 has a nucleotide sequence 97.4% identical to bases 4064 to 4808 of SEQ ID NO. 3 encoding the amino acids
of SEQ ID NO. 4; D77412 has a nucleotide sequence 82.2% identical to bases 134 to 408 of SEQ ID NO. 1 encoding the amino
acids of SEQ ID NO. 2; U66674 contains a nucleotide sequence 97.7% identical to bases 1946 to 3134 of SEQ ID NO. 5 encoding
the amino acids of SEQ ID NO. 6; and R97754 contains a nucleotide sequence 98.2% identical to bases 4 to 221 of SEQ ID NO.
7 encoding the amino acids of SEQ ID NO. 8. Thus claims 21, 22, 33 and 44 are clearly anticipated by GenBank Accession Nos.
U66687, D77412, U66674 and R97754, respectively.

Because claims 21, 22, 33 and 44 all lack novelty, they also lack inventive step. On the other hand, claims 1-20,
23-32, 34-43 and 45-59 all have inventive step and novelty.



NEW CITATIONS

VIII. Certain observations on the international application



The following observations on the clarity of the claims, description, and drawings or on the question whether the claims are fully 
supported by the description, are made: 

Claim 44 is objected to under PCT Rule 66.2(a)(v) as lacking clarity under PCT Article 6 because the claim is indefinite for the
following reason(s): SEQ ID NO. 7 represents a nucleotide sequence instead of an amino acid sequence.

Claims 53-55 are objected to under PCT Rule 66.2(a)(v) as lacking clarity under PCT Article 6 because the claims are indefinite
for the following reason(s): There is no antecedent basis for the term "said mouse" in claims 53-55.

The description is objected to under PCT Rule 66.2(a)(v) as lacking clarity under PCT Article 5 because it fails to adequately
enable practice of the claimed invention because: Claims 48-52 are drawn to a host cell comprising the nucleotide sequence of
SEQ ID NO. 1, 3, 5 or 7 and a host animal comprising said nucleotide sequence in vivo. Claims 53-55 are drawn to a host
animal, such as a transgenic mouse, harboring a homozygous null mutation in its endogenous MOAT gene. Making a transgenic
animal harboring a transgene under the control of a promoter was unpredictable at the time of the invention. One skilled
in the art would not be able to predict the phenotype of the transgenic animal produced. The vector used, the coding sequence,
the non-coding sequence, the promoter and the integration site of the transgene in the genome of the host cells are all important
factors contributing to the resulting phenotype of a transgenic animal. The specification of the present application fails to
provide adequate guidance for making a transgenic animal via embryonic stem cells except using mouse embryonic stem cells.
Thus, it would have required a skilled artisan to engage in undue experimentation to practice the claimed invention. The
description of the present application neither enables making a transgenic animal harboring any transgene, nor enables making
a host cell derived from said transgenic animal.

Claims 48-55 are objected to as lacking clarity under PCT Rule 66.2(a)(v) because practice of the claimed invention is not adequately
described in writing, as required under PCT Rule 5.1(a)(iii), for the reasons set forth in the immediately preceding paragraph.

Supplemental Box

(To be used when the space in any of the preceding boxes is not sufficient)

Continuation of: Boxes I - VIII Sheet 10 



CLASSIFICATION: 

The International Patent Classification (IPC) and/or the National classification are as listed below: 
IPC(7): A01N 63/00, A61K 39/395, C12N 15/00, A01N 61/00, C07H 21/02 and US Cl.: 424/93.1, 93.2, 130.1;
435/320.1, 325; 514/1; 536/23.1; 800/13, 18



I. BASIS OF REPORT: 

5. (Some) amendments are considered to go beyond the disclosure as filed: 
NONE

MRP-Related ABC Transporter
Encoding Nucleic Acids and Methods of Use Thereof



Pursuant to 35 U.S.C. §202(c) it is acknowledged that
the U.S. Government has certain rights in the invention
described herein, which was made in part with funds from
the National Institutes of Health, Grant Numbers CA63173
and CA06927.



FIELD OF THE INVENTION 

The present invention relates to the fields of 
medicine and molecular biology. More specifically, the 
invention provides nucleic acid molecules and proteins 
encoded thereby which are involved in the development of 
resistance to pharmacological and chemotherapeutic agents 
in tumor cells. 

BACKGROUND OF THE INVENTION 

Several publications are referenced in this 
application in parentheses in order to more fully describe 
the state of the art to which this invention pertains. 
The disclosure of each of these publications is 
incorporated by reference herein. 

P-glycoprotein, the product of the MDR1 gene, was the 
first ABC transporter shown to confer resistance to 
cytotoxic agents. Pgp functions as an ATP-dependent 
efflux pump that reduces the intracellular concentration 
of a variety of chemotherapeutic agents by transporting 
them across the plasma membrane (1). The multidrug
resistance phenotype associated with overexpression of Pgp
is of considerable clinical interest because natural
product drugs are second only to alkylating agents in
clinical utility, and many effective chemotherapeutic
regimens contain more than one natural product agent.
More recently, we and others have reported transfection
studies indicating that MRP, another ABC family
transporter, confers a multidrug resistance phenotype that
includes many natural product drugs, but is distinct from
the resistance phenotype associated with Pgp (2-6). MRP
shares only limited amino acid identity with Pgp, and this
is reflected in the different substrate specificities of
the two transporters. In contrast to Pgp, MRP can
transport a wide range of anionic organic conjugates,
including glutathione S-conjugates (7). In addition to
Pgp and MRP there may be other transporters that are
involved in cytotoxic drug resistance. In the case of
natural product drugs, resistant cell lines have been
described that display a multidrug resistant phenotype
associated with a drug accumulation deficit, but do not
overexpress Pgp or MRP (8). ABC transporters have also
been linked to cisplatin resistance, and several lines of
evidence suggest the possibility that pumps specific for
organic anions may be involved: 1) decreased cisplatin
accumulation is consistently observed in cisplatin
resistant cell lines (9); 2) cisplatin is conjugated to
glutathione in the cell, and this anionic conjugate is
toxic in an in vitro biochemical assay (10); and 3)
biochemical studies using membrane vesicle preparations
have shown that cisplatin resistant cell lines have
enhanced expression of an ATP-dependent transporter of
CDDP-glutathione and other glutathione S-conjugates such
as the cysteinyl leukotriene LTC4 (11, 12). These data thus
suggest that an organic anion transporter may contribute
to cisplatin resistance by exporting CDDP-glutathione.
While MRP is an organic anion transporter, the reported
drug resistance profile of MRP-transfected cells does not
extend to cisplatin (5, 6), and to date only one
cisplatin resistant cell line has been reported to
overexpress MRP (13). This suggests that organic anion
transporters other than MRP may contribute to cisplatin
resistance. Consistent with this possibility, the
canalicular multispecific organic anion transporter,
cMOAT, an MRP-related transporter that functions as the
major organic anion transporter in liver, has been
reported to be overexpressed in cisplatin resistant cell
lines (14, 15). A more direct link between cMOAT and
cytotoxic drug resistance is suggested by a recent report
in which transfection of a cMOAT antisense construct into
a liver cancer cell line resulted in sensitization to
cisplatin, daunorubicin and other cytotoxic agents (16).

Clearly, a need exists for identifying the essential 
components and mechanisms giving rise to drug resistance 
and the transport of anticancer agents out of the tumor 
cell. The elucidation of these mechanisms may be used to 
advantage for the design of efficacious chemotherapeutic
agents.

SUMMARY OF THE INVENTION 

This invention provides novel, biological molecules 
useful for identification, detection, and/or molecular 
characterization of components involved in the acquisition 
of drug resistance in tumor cells. According to one 
aspect of the invention, an isolated nucleic acid molecule 
is provided which includes a sequence encoding a protein 
transporter of a size between about 1300 and 1350 amino
acids in length. The encoded protein, referred to herein
as MOAT-B, comprises a multi-domain structure including a
tandem repeat of nucleotide binding folds appended
C-terminal to a hydrophobic domain that contains several
potential membrane spanning helices. Conserved Walker A
and B ATP binding sites are present in each of the
nucleotide binding folds.

WO 99/49735 PCT/US99/06644

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a human MOAT-B protein. In a particularly 
preferred embodiment, the human MOAT-B protein has an 
amino acid sequence the same as Sequence I.D. No. 2. An 
exemplary MOAT-B nucleic acid molecule of the invention 
comprises Sequence I.D. No. 1. 

According to another aspect of the invention, a 
second isolated nucleic acid molecule is provided which 
includes a sequence encoding a transporter between about 
1400 and 1450 amino acids. The encoded protein, referred 
to herein as MOAT-C, contains a multi-domain structure
including a tandem repeat of nucleotide binding folds 
appended C-terminal to a hydrophobic domain that contains 
several potential membrane spanning helices. Conserved 
Walker A and B ATP binding sites are present in each of 
the nucleotide binding folds. While similar in structure 
to MOAT-B described above, MOAT-C contains distinct
sequence differences.

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a human MOAT-C protein. In a particularly 
preferred embodiment, the human MOAT-C protein has an 
amino acid sequence the same as Sequence I.D. No. 4. An 
exemplary MOAT-C nucleic acid molecule of the invention 
comprises Sequence I.D. No. 3. 

According to yet another aspect of the invention, an
isolated nucleic acid molecule is provided which includes
a sequence encoding a protein of a size between about 1500
and 1550 amino acids in length. The encoded protein,
referred to herein as MOAT-D, contains a multidomain
structure including an N-terminal hydrophobic extension
which harbors five transmembrane spanning helices.

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a MOAT-D protein. In a particularly
preferred embodiment, the human MOAT-D protein has an 
amino acid sequence the same as Sequence I.D. No. 6. An 
exemplary MOAT-D nucleic acid molecule of the invention 
comprises Sequence I.D. No. 5. 

According to yet another aspect of the invention, an 
isolated nucleic acid molecule is provided which includes 
a sequence encoding a protein of a size between about 1480 
and 1530 amino acids in length. The encoded protein,
referred to herein as MOAT-E, contains a multidomain
structure including an N-terminal hydrophobic extension
which harbors several transmembrane spanning helices.
While similar in structure to MOAT-D described above, 
MOAT-E contains distinct sequence differences. 

In a preferred embodiment of the invention, an 
isolated nucleic acid molecule is provided that includes a 
cDNA encoding a MOAT-E protein. In a particularly 
preferred embodiment, the human MOAT-E protein has an 
amino acid sequence the same as Sequence I.D. No. 8. An 
exemplary MOAT-E nucleic acid molecule of the invention 
comprises Sequence I.D. No. 7. 

According to another aspect of the present invention, 
an isolated nucleic acid molecule is provided, which has a 
sequence selected from the group consisting of: (1)
Sequence I.D. No. 1; (2) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 1 comprising
nucleic acids encoding amino acids 1-1154 of Sequence I.D.
No. 2; (3) a sequence encoding preselected portions of
Sequence I.D. No. 1 within nucleotides 1-3462; (4)
Sequence I.D. No. 3; (5) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 3 comprising
nucleic acids encoding amino acids 1-442 of Sequence I.D.
No. 4; (6) a sequence encoding preselected portions of
Sequence I.D. No. 3 within nucleotides 1-1326; (7)
Sequence I.D. No. 5; (8) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 5 comprising
nucleic acids encoding amino acids 1-1036 of Sequence I.D.
No. 6; (9) a sequence encoding preselected portions of
Sequence I.D. No. 5 within nucleotides 1-3108; (10)
Sequence I.D. No. 7; (11) a sequence specifically
hybridizing with preselected portions or all of the
complementary strand of Sequence I.D. No. 7 comprising
nucleic acids encoding amino acids 1-998 of Sequence I.D.
No. 8; and (12) a sequence encoding preselected portions of
Sequence I.D. No. 7 within nucleotides 1-300.

Such partial sequences are useful as probes to 
identify and isolate homologues of the MOAT genes of the 
invention. Additionally, isolated nucleic acid sequences 
encoding natural allelic variants of the nucleic acids of 
Sequence I.D. Nos. 1, 3, 5 and 7 are also contemplated to
be within the scope of the present invention. The term 
natural allelic variants will be defined hereinbelow. 

According to another aspect of the present invention, 
antibodies immunologically specific for the human MOAT 
proteins described hereinabove are provided.

In yet another aspect of the invention, host cells
comprising at least one of the MOAT encoding nucleic acids
are provided. Such host cells include but are not limited
to bacterial cells, fungal cells, insect cells, mammalian
cells, and plant cells. Host cells overexpressing one
or more of the MOAT encoding nucleic acids of the
invention provide valuable research tools for assessing
transport of chemotherapeutic agents out of cells.
MOAT expressing cells also comprise a biological system
useful in methods for identifying inhibitors of the MOAT
transporters.

Another embodiment of the present invention 
encompasses methods for screening cells expressing MOAT 
encoding nucleic acids for chemotherapy resistance. Such 
methods will provide the clinician with data which
correlate expression of a particular MOAT gene with a
particular chemotherapy resistant phenotype.

Diagnostic methods are also contemplated in the 
present invention. Accordingly, suitable oligonucleotide 
probes are provided which hybridize to the nucleic acids 
of the invention. Such probes may be used to advantage in 
screening biopsy samples for the expression of particular 
MOAT genes. Once a tumor sample has been characterized as 
to the MOAT gene(s) expressed therein, inhibitors 
identified in the cell line screening methods described 
above may be administered to prevent efflux of the 
beneficial chemotherapeutic agents from cancer cells.

The methods of the invention may be applied to kits. 
An exemplary kit of the invention comprises MOAT gene 
specific oligonucleotide probes and/or primers, MOAT 
encoding DNA molecules for use as a positive control, 
buffers, and an instruction sheet. A kit for practicing 
the cell line screening method includes frozen cells
comprising the MOAT genes of the invention, suitable
culture media, buffers and an instruction sheet.

In a further aspect of the invention, transgenic
knockout mice are disclosed. Mice will be generated in
which at least one MOAT gene has been knocked out. Such
mice will provide a valuable biological system for
assessing resistance to chemotherapy in an in vivo tumor
model.

Various terms relating to the biological molecules of 
the present invention are used hereinabove and also 
throughout the specification and claims. The terms 
"percent similarity" and "percent identity (identical)" 
are used as set forth in the UW GCG Sequence Analysis 
program (Devereux et al., NAR 12:387-397 (1984)).

With reference to nucleic acids of the invention, the 
term "isolated nucleic acid" is sometimes used. This 
term, when applied to DNA, refers to a DNA molecule that 
is separated from sequences with which it is immediately 
contiguous (in the 5' and 3' directions) in the naturally 
occurring genome of the organism from which it originates. 
For example, the "isolated nucleic acid" may comprise a 
DNA or cDNA molecule inserted into a vector, such as a 
plasmid or virus vector, or integrated into the genomic 
DNA of a prokaryote or eukaryote.

With respect to RNA molecules of the invention, the 
term "isolated nucleic acid" primarily refers to an RNA 
molecule encoded by an isolated DNA molecule as defined 
above. Alternatively, the term may refer to an RNA 
molecule that has been sufficiently separated from RNA 
molecules with which it would be associated in its natural 
state (i.e., in cells or tissues), such that it exists in 
a "substantially pure" form (the term "substantially pure" 
is defined below).

With respect to protein, the term "isolated protein"
or "isolated and purified protein" is sometimes used 
herein. This term refers primarily to a protein produced 
by expression of an isolated nucleic acid molecule of the 
invention. Alternatively, this term may refer to a 
protein which has been sufficiently separated from other 
proteins with which it would naturally be associated, so 
as to exist in "substantially pure" form. 

The term "substantially pure" refers to a preparation 
comprising at least 50-60% by weight the compound of 
interest (e.g., nucleic acid, oligonucleotide, protein, 
etc.). More preferably, the preparation comprises at 
least 75% by weight, and most preferably 90-99% by weight, 
the compound of interest. Purity is measured by methods 
appropriate for the compound of interest (e.g.,
chromatographic methods, agarose or polyacrylamide gel
electrophoresis, HPLC analysis, and the like). With
respect to antibodies of the invention, the term
"immunologically specific" refers to antibodies that bind
to one or more epitopes of a protein of interest (e.g.,
MOAT-B, MOAT-C or MOAT-D), but which do not substantially
recognize and bind other molecules in a sample containing 
a mixed population of antigenic biological molecules. 

With respect to nucleic acids and oligonucleotides, 
the term "specifically hybridizing" refers to the 
association between two single-stranded nucleotide
molecules of sufficiently complementary sequence to permit
such hybridization under pre-determined conditions
generally used in the art (sometimes termed "substantially
complementary"). When used in reference to a double
stranded nucleic acid, this term is intended to signify
that the double stranded nucleic acid has been subjected
to denaturing conditions, as is well known to those of
skill in the art. In particular, the term refers to
hybridization of an oligonucleotide with a substantially
complementary sequence contained within a single-stranded
DNA or RNA molecule of the invention, to the substantial
exclusion of hybridization of the oligonucleotide with
single-stranded nucleic acids of non-complementary
sequence.

One common formula for calculating the stringency
conditions required to achieve hybridization between
nucleic acid molecules of a specified sequence homology
is set forth below (Sambrook et al., 1989):

Tm = 81.5°C + 16.6 Log[Na+] + 0.41(% G+C) - 0.63(% formamide) -
600/#bp in duplex

As an illustration of the above formula, using [Na+]
= [0.368] and 50% formamide, with GC content of 42% and an
average probe size of 200 bases, the Tm is 57°C. The Tm of
a DNA duplex decreases by 1-1.5°C with every 1% decrease
in homology. Thus, targets with greater than about 75%
sequence identity would be observed using a hybridization
temperature of 42°C. Such sequences would be considered
substantially homologous to the nucleic acid sequences of
the invention.
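
The illustration above can be checked numerically. The following is a minimal sketch, assuming that "Log" denotes the base-10 logarithm and that [Na+] is expressed in mol/L; the function name and parameter names are hypothetical and are used here for illustration only.

```python
import math


def melting_temperature(na_molar, gc_percent, formamide_percent, duplex_bp):
    """Return Tm in degrees C using the formula quoted in the text.

    Assumptions (not stated explicitly in the text): the logarithm is
    base 10 and the sodium concentration is given in mol/L.
    """
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * gc_percent
        - 0.63 * formamide_percent
        - 600.0 / duplex_bp
    )


# Values taken from the illustration in the text.
tm = melting_temperature(
    na_molar=0.368, gc_percent=42, formamide_percent=50, duplex_bp=200
)
print(f"Tm = {tm:.1f} C")  # approximately 57 C, as stated in the text

# Difference between the calculated Tm and the 42 C hybridization temperature.
print(f"Tm - 42 C = {tm - 42.0:.1f} C")
```

With these inputs the individual terms are approximately -7.2 (salt), +17.2 (G+C), -31.5 (formamide) and -3.0 (duplex length), which combine with 81.5 to give the stated Tm of about 57°C.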

The nucleic acids, proteins, antibodies, cell lines, 
methods, and kits of the present invention may be used to 
advantage to identify targets for the development of novel 
agents which inhibit the aberrant transport of cytotoxic
agents out of tumor cells. The transgenic mice of the
invention may be used as an in vivo model for chemotherapy
resistance.

The human MOAT molecules, methods and kits described
above may also be used as research tools and will
facilitate the elucidation of the mechanism by which tumor

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 shows the predicted structure of MOAT-B and 
comparison with human MRP. The vertical lines indicate 
identical amino acids and the vertical dots indicate 
conserved amino acids. Gaps are indicated by periods. 
The overbars indicate potential transmembrane spanning 
segments as predicted by the TMAP program. The first and 
second nucleotide binding folds (NBF 1 and NBF 2) are 
indicated by horizontal arrows. The C-terminal 34 amino 
acids (residues 1291-1325) are replaced in the second
class of MOAT-B cDNA clones by the following amino acids:
ILQKKLSTYWSH. The alignment was performed using the GAP
program (gap weight 3.0, length weight 0.1) in the 
Genetics Computer Group Package. H. MRP: human MRP. 

Figures 2A and 2B depict a comparison of the 
nucleotide binding folds and hydropathy profile of MOAT-B 
with those of other eukaryotic ABC transporters. Fig. 2A
shows the comparison of the nucleotide binding folds of
MOAT-B. Amino acids that are identical to those of MOAT-B
are shaded, and gaps are indicated by periods. Walker A
and B motifs, and the ABC transporter family signature
sequence C, are underlined. Amino acid positions are
indicated to the right. Amino acid sequences were aligned
using the PILEUP program (gap weight 3.0, length weight
0.1) in the Genetics Computer Group Package. Fig. 2B
shows a comparison of the MOAT-B hydropathy profile. To
facilitate comparison, the proteins are aligned so that
the N-terminal nucleotide binding folds (NBF) are roughly
in register. NBFs are indicated by bars. Values above
and below the horizontal lines indicate hydrophobic and
hydrophilic regions, respectively. Hydrophobicity plots
were generated using the Kyte-Doolittle algorithm with a
window of 7 residues. The transporters shown are: human
multidrug-associated protein, H. MRP (P33529); human
multispecific organic anion transporter, H. MOAT (U63970);
Saccharomyces cerevisiae yeast cadmium factor 1, S. YCF1
(P39109); rat sulfonylurea receptor, R. SUR (Q09427);
human cystic fibrosis transmembrane conductance regulator,
H. CFTR (M28668); Leishmania P-glycoprotein, L. PgpA
(P21441) and human mdr1 gene product, H. MDR1 (P08183).
Accession numbers are shown in parentheses. 
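
For orientation, a hydropathy profile of the kind described above can be sketched as a sliding-window average of the Kyte-Doolittle hydropathy index. The sketch below is illustrative only: the text does not state which software produced the plots or how windows at the sequence ends were handled, so the plain mean over full windows and the function name are assumptions. The example input is the 12-residue peptide quoted in the description of Figure 1.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every full window along the sequence.

    Assumption: only complete windows are scored, so a sequence of
    length n yields n - window + 1 values (none if n < window).
    """
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Example: the alternative C-terminal peptide quoted for Figure 1.
profile = hydropathy_profile("ILQKKLSTYWSH", window=7)
for position, value in enumerate(profile, start=1):
    # Positive values indicate hydrophobic windows, negative hydrophilic.
    print(f"window starting at residue {position}: {value:+.2f}")
```

Positive window averages correspond to the hydrophobic regions plotted above the horizontal line, and negative averages to the hydrophilic regions plotted below it.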

Figure 3 is a Northern blot showing the tissue 
distribution of MOAT-B transcript. Membranes containing
poly(A)+ RNA prepared from human tissues were hybridized
with a radiolabeled MOAT-B or GAPDH probe. Top panels
show MOAT-B transcript and bottom panels show the control 
GAPDH transcript. Arrows indicate the position of MOAT-B 
transcript. Prolonged exposure of the film revealed a low 
level signal in liver. 

Figure 4 shows the chromosomal localization of the 
gene encoding MOAT-B. Human metaphase spreads were 
hybridized with a biotin-labeled MOAT-B cDNA probe and
detected by FITC-conjugated avidin. Hybridization signals
at chromosome 13q32 in two metaphase spreads are indicated
by arrows. The inset shows paired hybridization signals
at band q32 of chromosome 13 from three other metaphase
spreads.

Figures 5A and 5B show the predicted structures of 
MOAT-C and MOAT-D. Fig. 5A presents the structure of
MOAT-C. Fig. 5B shows the structure of MOAT-D. Numbered
overbars indicate potential transmembrane spanning
helices. Horizontal arrows indicate the positions of the
amino terminal (NBF1) and C-terminal (NBF2) nucleotide
binding folds. Walker A and B motifs, and the ABC
transporter family signature sequence C are underlined.
Bullets indicate the positions of potential N-linked
glycosylation sites that are conserved with previously
reported N-glycosylation sites in MRP. The indicated
MOAT-C transmembrane spanning helices were predicted using 
the TMAP program and an input alignment of MOAT-B and 
MOAT-C. The indicated MOAT-D transmembrane helices are 
based upon inspection of an alignment with MRP. 

Figures 6A and 6B show a comparison of the nucleotide 
binding folds and hydropathy profiles of MOAT-C and MOAT-D 
with those of other related ABC transporters. Fig. 6A 
depicts the comparison of the nucleotide binding folds. 
The alignment was produced using the PILEUP command (gap 
weight 3.0, length weight 0.1) in the Genetics Computer 
Group Package Version 9.1. Amino acid positions conserved 
in at least 4 of the 8 proteins are shaded. Periods 
indicate gaps in the alignment. Walker A and B, and the 
ABC transporter family signature sequence C are indicated 
by underbars. Fig. 6B shows the comparison of hydropathy
profiles. To facilitate comparisons, gaps were introduced 
at the N-termini of some proteins in order to bring the 
first nucleotide binding folds into register. Nucleotide 
binding folds are indicated by bars. Values above and 
below the horizontal lines indicate hydrophobic and 
hydrophilic regions, respectively. Hydrophobicity plots 
were generated using the Kyte-Doolittle algorithm with a 
window of 7 residues. Accession numbers are as follows:
MRP, P33529; cMOAT, U63970; SUR, Q09428; CFTR, P13569;
MDR1, P08183.



Figure 7 is a Northern blot showing the tissue 
distribution of MOAT-C and MOAT-D transcripts. Blots 
containing poly A+ RNA prepared from various human tissues 
were hybridized with MOAT-C, MOAT-D and actin probes. 
Arrows indicate the position of the MOAT-C (top panel) and 
MOAT-D (middle panel) transcripts. The bottom panel shows 
the control actin transcript. 

Figures 8A and 8B show the chromosomal localization 
of the MOAT-C and MOAT-D genes. Human metaphase spreads 
were hybridized with biotin-labeled MOAT-C and MOAT-D
cDNA probes and detected by FITC-conjugated avidin. Fig.
8A shows the localization of MOAT-C. Hybridization
signals at chromosome 3q27 in two metaphase spreads are
indicated by arrows (top). The inset shows paired
hybridization signals at band q27 of chromosome 3 from
three other metaphase spreads. Fig. 8B shows the
localization of MOAT-D. Hybridization signals at
chromosome 17q21-22 in two metaphase spreads are indicated
by arrows (top). The inset shows paired hybridization
signals at band q21-22 of chromosome 17 from three other
metaphase spreads.

Figure 9 shows the predicted amino acid sequence of
MOAT-E. Also shown are the locations of the potential
transmembrane helices (overbars), the potential
N-glycosylation site (black dot) and the two nucleotide
binding folds (NBF1 and NBF2). Walker A and B motifs, as
well as the signature C motif of ABC transporters, are
also indicated.

Figure 10 shows a comparison of the hydropathy
profile of MOAT-E with other members of the MRP-cMOAT
subfamily. The profile reveals that MOAT-E has a
hydrophobic N-terminal segment which is absent in MOAT-B
and MOAT-C.

Figure 11 is an RNA blot which reveals that MOAT-E is
expressed only in the liver and the kidney, suggesting 
that MOAT-E may participate in the excretion of substances 
into urine and bile. The lower panel shows hybridization 
of an actin probe to assess RNA loading. 

Figures 12A-12J show the cDNA (SEQ ID NO: 1) and
amino acid sequences (SEQ ID NO: 2) encoded by MOAT-B.

Figures 13A-13K show the cDNA (SEQ ID NO: 3) and
amino acid sequences (SEQ ID NO: 4) encoded by MOAT-C.

Figures 14A-14K show the cDNA (SEQ ID NO: 5) and
amino acid sequences (SEQ ID NO: 6) encoded by MOAT-D.

Figures 15A-15K show the cDNA (SEQ ID NO: 7) and
amino acid sequences (SEQ ID NO: 8) encoded by MOAT-E.

DETAILED DESCRIPTION OF THE INVENTION 

MRP and cMOAT are closely related mammalian ABC 
transporters that export organic anions from cells. 
Transfection studies have established that MRP confers 
resistance to natural product cytotoxic agents, and recent 
evidence suggests the possibility that cMOAT may 
contribute to cytotoxic drug resistance as well. Based 
upon the potential importance of these transporters in 
clinical drug resistance, and their important
physiological roles in the export of the amphiphilic
products of phase I and phase II metabolism, we sought to
identify other MRP-related transporters. Using a
degenerate PCR approach, a cDNA molecule was isolated
which encodes a novel ABC transporter designated herein as
MOAT-B. The MOAT-B gene was mapped using fluorescence in
situ hybridization to chromosome band 13q32. Comparison
of the MOAT-B predicted protein with other transporters
revealed that it is most closely related to MRP, cMOAT,
and the yeast organic anion transporter YCF1. While
MOAT-B is closely related to these transporters, it is
distinguished by the absence of an approximately 200 amino
acid N-terminal hydrophobic extension that is present in
MRP and cMOAT, and which is predicted to encode several
transmembrane spanning segments. In addition, the MOAT-B
tissue distribution is distinct from MRP and cMOAT. In 
contrast to MRP, which is widely expressed in most 
tissues, including liver, and cMOAT, whose expression is 
largely restricted to liver, the MOAT-B transcript is 
widely expressed, with particularly high levels in 
prostate, but is barely detectable in liver. These data 
indicate that MOAT-B is a ubiquitously expressed 
transporter that is closely related to MRP and cMOAT, and 
indicate that it is an organic anion pump relevant to 
cellular detoxification. 

Three additional MRP/cMOAT-related transporters,
MOAT-C, MOAT-D and MOAT-E, are also disclosed herein.
MOAT-C encodes a 1437 amino acid protein that is most
closely related to MRP, cMOAT and MOAT-B among eukaryotic
transporters (33%-37% identity). However, based upon
amino acid identity, MOAT-C is considerably less related
to MRP and cMOAT than the latter transporters are to each
other (48% identity). In addition, the MOAT-C topology is
distinct from that of MRP and cMOAT in that it, like
MOAT-B, lacks an N-terminal transmembrane spanning domain.
MOAT-D encodes a 1530 amino acid transporter that is
highly related to MRP (57% identity) and cMOAT (47%
identity). MOAT-E encodes a 1503 amino acid transporter
that is highly related to MOAT-D, MRP and cMOAT (39-45%
identity). The topologies of MOAT-D and MOAT-E are quite
similar to those of MRP and cMOAT, in that they have an
N-terminal hydrophobic extension that is predicted to harbor
five transmembrane spanning helices. MOAT-C and MOAT-D were
mapped to chromosome bands 3q27 and 17q21-22,
respectively, by fluorescence in situ hybridization.
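
For purposes of illustration, a percent identity figure of the kind quoted above may be computed from two sequences that have already been aligned. The sketch below assumes that gapped columns are excluded from the comparison; the disclosure does not state which denominator or alignment program was used, and the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are not scored
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pre-aligned fragments, not taken from any MOAT protein.
seq_a = "MKT-LLVA"
seq_b = "MRTALL-A"
identity = percent_identity(seq_a, seq_b)
```

Other conventions, such as dividing by the length of the shorter sequence or by the full alignment length, give somewhat different values for the same alignment.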

The expression patterns of MOAT-C, MOAT-D and MOAT-E 
are distinct from those of MRP, cMOAT and MOAT-B. MOAT-C 
transcript is widely expressed, with highest levels in 
skeletal muscle, kidney and testis, but is expressed at 
barely detectable levels in liver and lung. MOAT-D 
transcript has a more restricted expression pattern, with 
high levels in colon, pancreas, liver and kidney. Data 
presented herein reveal that MOAT-E expression is 
restricted to liver and kidney. 

Based upon degree of amino acid identity, and protein 
topology, the MRP-related transporters fall into two 
groups, with the first group consisting of MRP, cMOAT, 
MOAT-D and MOAT-E, and the second group consisting of 
MOAT-B and MOAT-C. The isolation of MOAT-C, MOAT-D and 
MOAT-E thus helps to define the MRP/cMOAT subfamily. The 
high degree of amino acid identity and topological 
similarity of MOAT-D and MOAT-E to MRP and cMOAT suggest 
that they function as organic anion transporters, and play 
a role in cytotoxic drug resistance. In contrast, the 
lower degree of amino acid identity and distinct topology
of MOAT-B and MOAT-C suggest the possibility that their
substrate specificities and functions may be distinct from
those of MRP, cMOAT, MOAT-D and MOAT-E.

The compositions, methods, kits and transgenic mice
of the invention disclosed herein will facilitate the
identification of drugs that cripple the ability of MOAT
genes and proteins encoded thereby to effect the efflux of
clinically beneficial pharmacological agents in malignant
cells.



I. Preparation of MOAT-Encoding Nucleic Acid Molecules,
MOAT Proteins, and Antibodies Thereto 
A. Nucleic Acid Molecules 

Nucleic acid molecules encoding the MOAT proteins of 
the invention may be prepared by two general methods: (1) 
synthesis from appropriate nucleotide triphosphates, or 
(2) isolation from biological sources. Both methods 
utilize protocols well known in the art. The availability 
of nucleotide sequence information, such as cDNAs having 
Sequence I.D. Nos. 1, 3, 5, or 7 enables preparation of an
isolated nucleic acid molecule of the invention by 
oligonucleotide synthesis. Synthetic oligonucleotides may 
be prepared by the phosphoramidite method employed in the 
Applied Biosystems 38A DNA Synthesizer or similar devices. 
The resultant construct may be purified according to 
methods known in the art, such as high performance liquid 
chromatography (HPLC). Long, double-stranded
polynucleotides, such as a DNA molecule of the present
invention, must be synthesized in stages, due to the size
limitations inherent in current oligonucleotide synthetic
methods. Thus, for example, a 5 kb double-stranded
molecule may be synthesized as several smaller segments of
appropriate complementarity. Complementary segments thus
produced may be annealed such that each segment possesses
appropriate cohesive termini for attachment of an adjacent 
segment. Adjacent segments may be ligated by annealing 
cohesive termini in the presence of DNA ligase to 
construct an entire 5 kb double-stranded molecule. A 
synthetic DNA molecule so constructed may then be cloned 
and amplified in an appropriate vector. 

Nucleic acid sequences encoding the MOAT proteins of 
the invention may be isolated from appropriate biological 
sources using methods known in the art. In a preferred 
embodiment, a cDNA clone is isolated from a cDNA 
expression library of human origin. In an alternative 
embodiment, utilizing the sequence information provided by 
the cDNA sequence, human genomic clones encoding MOAT 
proteins may be isolated. Alternatively, cDNA or genomic 
clones having homology with MOAT-B, MOAT-C, MOAT-D or
MOAT-E may be isolated from other species using 
oligonucleotide probes corresponding to predetermined 
sequences within the MOAT encoding nucleic acids. 

In accordance with the present invention, nucleic 
acids having the appropriate level of sequence homology 
with the protein coding region of Sequence I.D. Nos. 1, 3,
5, and 7 may be identified by using hybridization and
washing conditions of appropriate stringency. For
example, hybridizations may be performed, according to the
method of Sambrook et al. (supra), using a hybridization
solution comprising: 5X SSC, 5X Denhardt's reagent, 1.0%
SDS, 100 µg/ml denatured, fragmented salmon sperm DNA,
0.05% sodium pyrophosphate and up to 50% formamide.
Hybridization is carried out at 37-42°C for at least six
hours. Following hybridization, filters are washed as
follows: (1) 5 minutes at room temperature in 2X SSC and
1% SDS; (2) 15 minutes at room temperature in 2X SSC and
0.1% SDS; (3) 30 minutes-1 hour at 37°C in 1X SSC and 1%
SDS; (4) 2 hours at 42-65°C in 1X SSC and 1% SDS, changing
the solution every 30 minutes.
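
The wash schedule recited above may, purely as a bookkeeping aid, be written down as structured data so that the total wash time can be tallied. The class and field names below are hypothetical; the durations, temperatures and buffer compositions are those stated in the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Wash:
    min_minutes: int      # shortest stated duration
    max_minutes: int      # longest stated duration
    temperature: str
    ssc: str
    sds_percent: float


# The four post-hybridization washes, in order.
WASHES = [
    Wash(5, 5, "room temperature", "2X", 1.0),
    Wash(15, 15, "room temperature", "2X", 0.1),
    Wash(30, 60, "37 C", "1X", 1.0),
    Wash(120, 120, "42-65 C", "1X", 1.0),
]


def total_wash_minutes(washes):
    """Return (minimum, maximum) total wash time in minutes."""
    return (
        sum(w.min_minutes for w in washes),
        sum(w.max_minutes for w in washes),
    )


shortest, longest = total_wash_minutes(WASHES)
```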

Nucleic acids of the present invention may be 
maintained as DNA in any convenient cloning vector. In a 
preferred embodiment, clones are maintained in a plasmid 
cloning/expression vector, such as pBluescript 
(Stratagene, La Jolla, CA), which is propagated in a
suitable E. coli host cell. 

MOAT-encoding nucleic acid molecules of the invention 
include cDNA, genomic DNA, RNA, and fragments thereof 
which may be single- or double-stranded. Thus, this
invention provides oligonucleotides (sense or antisense 
strands of DNA or RNA) having sequences capable of 
hybridizing with at least one sequence of a nucleic acid 
molecule of the present invention, such as selected 
segments of the cDNA having Sequence I.D. No. 1. Such 
oligonucleotides are useful as probes for detecting or 
isolating MOAT genes. Antisense nucleic acid molecules 
may be targeted to translation initiation sites and/or 
splice sites to inhibit the translation of the 
MOAT-encoding nucleic acids of the invention. Such 
antisense molecules are typically between 15 and 30
nucleotides in length and often span the translational
start site of MOAT encoding mRNA molecules. 
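
As a non-limiting sketch, an antisense DNA oligonucleotide of 15 to 30 nucleotides spanning a translational start site may be derived from a cDNA sequence as shown below. The sketch assumes, for simplicity, that the first ATG in the supplied sequence is the start codon; the function names, default lengths and the toy sequence are hypothetical and are not drawn from Sequence I.D. Nos. 1, 3, 5 or 7.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_over_start(cdna, length=20, upstream=8):
    """Antisense oligo covering the first ATG of cdna (assumed start codon)."""
    cdna = cdna.upper()
    if not 15 <= length <= 30:
        raise ValueError("length should be between 15 and 30 nucleotides")
    if upstream + 3 > length:
        raise ValueError("oligo would not cover the whole start codon")
    atg = cdna.find("ATG")
    if atg < 0:
        raise ValueError("no ATG found")
    start = max(0, atg - upstream)
    target = cdna[start:start + length]
    if len(target) < length:
        raise ValueError("sequence too short for requested oligo")
    return reverse_complement(target)


# Hypothetical 30-nucleotide cDNA fragment used only for illustration.
toy_cdna = "GCCGCCACCATGGCGTTAGCTAGCTAGGCT"
oligo = antisense_over_start(toy_cdna)
```

The returned oligonucleotide is written 5' to 3' and is complementary to the sense strand over the chosen window, so it contains CAT opposite the ATG.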

It will be appreciated by persons skilled in the art 
that variants of these sequences exist in the human 
population, and must be taken into account when designing 
and/or utilizing oligos of the invention. Accordingly, it 
is within the scope of the present invention to encompass 
such variants, with respect to the MOAT sequences 
disclosed herein or the oligos targeted to specific 
locations on the respective genes or RNA transcripts. 
With respect to the inclusion of such variants, the term
"natural allelic variants" is used herein to refer to 
various specific nucleotide sequences and variants thereof 
that would occur in a human population. The usage of 
different wobble codons and genetic polymorphisms which 
give rise to conservative or neutral amino acid 
substitutions in the encoded protein are examples of such 
variants. Additionally, the term "substantially 
complementary" refers to oligo sequences that may not be 
perfectly matched to a target sequence, but the mismatches 
do not materially affect the ability of the oligo to 
hybridize with its target sequence under the conditions 
described.
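
The notion of different wobble codons may be illustrated with the standard genetic code. The following sketch builds the standard codon table and reports whether two codons encode the same amino acid; the helper name is hypothetical, and the sketch does not attempt to decide whether a non-synonymous change is conservative or neutral.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def is_synonymous(codon_a, codon_b):
    """True if both DNA codons encode the same amino acid (or both are stops)."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


# CTT and CTC differ only at the third (wobble) position; both encode leucine.
same_leucine = is_synonymous("CTT", "CTC")
# GAT (aspartate) and GAA (glutamate) differ at the third position but
# encode different, chemically similar, amino acids.
asp_vs_glu = is_synonymous("GAT", "GAA")
```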

B. Proteins

Full-length MOAT-B, MOAT-C, MOAT-D and MOAT-E 
proteins of the present invention may be prepared in a 
variety of ways, according to known methods. The proteins 
may be purified from appropriate sources, e.g.,
transformed bacterial or animal cultured cells or tissues,
by immunoaffinity purification. However, this is not a
preferred method due to the low amount of protein likely 
to be present in a given cell type at any time. The 
availability of nucleic acid molecules encoding MOAT 
proteins enables production of the proteins using in vitro 
expression methods known in the art. For example, a cDNA 
or gene may be cloned into an appropriate in vitro 
transcription vector, such as pSP64 or pSP65 for in vitro 
transcription, followed by cell-free translation in a 
suitable cell-free translation system, such as wheat germ
or rabbit reticulocytes. In vitro transcription and 
translation systems are commercially available, e.g., from 
Promega Biotech, Madison, Wisconsin or Gibco-BRL, 
Gaithersburg, Maryland.

Alternatively, according to a preferred embodiment, 
larger quantities of MOAT proteins may be produced by 
expression in a suitable prokaryotic or eukaryotic system. 
For example, part or all of a DNA molecule, such as a cDNA 
having Sequence I.D. No. 1, 3, 5 or 7 may be inserted into 
a plasmid vector adapted for expression in a bacterial 
cell, such as E. coli. Such vectors comprise the
regulatory elements necessary for expression of the DNA in 
the host cell positioned in such a manner as to permit 
expression of the DNA in the host cell. Such regulatory 
elements required for expression include promoter 
sequences, transcription initiation sequences and, 
optionally, enhancer sequences. 

The human MOAT proteins produced by gene expression 
in a recombinant prokaryotic or eukaryotic system may be
purified according to methods known in the art. In a 
preferred embodiment, a commercially available 
expression/secretion system can be used, whereby the 
recombinant protein is expressed and thereafter secreted 
from the host cell, to be easily purified from the 
surrounding medium. If expression/secretion vectors are 
not used, an alternative approach involves purifying the 
recombinant protein by affinity separation, such as by 
immunological interaction with antibodies that bind 
specifically to the recombinant protein or nickel columns 
for isolation of recombinant proteins tagged with 6-8 
histidine residues at their N-terminus or C-terminus. 
Alternative tags may comprise the FLAG epitope or the 
hemagglutinin epitope. Such methods are commonly used by 
skilled practitioners. 

The human MOAT proteins of the invention, prepared by 
the aforementioned methods, may be analyzed according to 
standard procedures. For example, such proteins may be
subjected to amino acid sequence analysis, according to
known methods.

The present invention also provides antibodies 
capable of immunospecifically binding to proteins of the
invention. Polyclonal antibodies directed toward human
MOAT proteins may be prepared according to standard
methods. In a preferred embodiment, monoclonal antibodies
are prepared, which react immunospecifically with the
various epitopes of the MOAT proteins described herein.
Monoclonal antibodies may be prepared according to general
methods of Kohler and Milstein, following standard
protocols. Polyclonal or monoclonal antibodies that
immunospecifically interact with MOAT proteins can be
utilized for identifying and purifying such proteins. For
example, antibodies may be utilized for affinity
separation of proteins with which they immunospecifically
interact. Antibodies may also be used to 
immunoprecipitate proteins from a sample containing a 
mixture of proteins and other biological molecules. Other 
uses of anti-MOAT antibodies are described below. 

II. Uses of MOAT-Encoding Nucleic Acids,
MOAT Proteins and Antibodies Thereto 

Cellular transporter molecules have received a great 
deal of attention as potential targets of chemotherapeutic 
agents designed to effectively block the export of 
pharmacological reagents from tumor cells. The MOAT 
proteins of the invention play a pivotal role in the 
transport of molecules across the cell membrane. 

Additionally, MOAT nucleic acids, proteins and 
antibodies thereto, according to this invention, may be 
used as research tools to identify other proteins that are 
intimately involved in the transport of molecules into and
out of cells. Biochemical elucidation of molecular
mechanisms which govern such transport will facilitate the
development of novel anti-transport agents that may
sensitize tumor cells to conventional chemotherapeutic
agents.

A. MOAT-Encoding Nucleic Acids

MOAT-encoding nucleic acids may be used for a variety
of purposes in accordance with the present invention.
MOAT-encoding DNA, RNA, or fragments thereof may be used
as probes to detect the presence of and/or expression of
genes encoding MOAT proteins. Methods in which
MOAT-encoding nucleic acids may be utilized as
probes for such assays include, but are not limited to:
(1) in situ hybridization; (2) Southern hybridization; (3)
northern hybridization; and (4) assorted amplification
reactions such as polymerase chain reactions (PCR).

The MOAT-encoding nucleic acids of the invention may 
also be utilized as probes to identify related genes from 
other animal species. As is well known in the art, 
hybridization stringencies may be adjusted to allow 
hybridization of nucleic acid probes with complementary 
sequences of varying degrees of homology. Thus, 
MOAT-encoding nucleic acids may be used to advantage to 
identify and characterize other genes of varying degrees 
of relation to the MOAT genes of the invention. Such 
information enables further characterization of 
transporter molecules which give rise to the 
chemoresistant phenotype of certain tumors. Additionally, 
they may be used to identify genes encoding proteins that 
interact with MOAT proteins (e.g., by the "interaction
trap" technique), which should further accelerate
identification of the components involved in the
acquisition of drug resistance. The MOAT encoding nucleic 
acids may also be used to generate primer sets suitable 
for PCR amplification of target MOAT DNA. Criteria for 
selecting suitable primers are well known to those of 
ordinary skill in the art. 
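
Two commonly considered primer properties are GC content and an approximate melting temperature. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a rough estimate generally applied only to short oligonucleotides; it is offered as one example of such criteria, not as the criteria contemplated herein, and the primer shown is hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer):
    """Approximate melting temperature (deg C) by the Wallace rule."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


# Hypothetical 18-nucleotide primer, not derived from a MOAT sequence.
primer = "ATGGCGTTAGCTAGCTAG"
tm = wallace_tm(primer)
gc = gc_fraction(primer)
```

In practice, additional considerations such as primer length, self-complementarity and matching of the melting temperatures of the two primers are also typically weighed.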

Nucleic acid molecules, or fragments thereof,
encoding MOAT genes may also be utilized to control the
production of MOAT proteins, thereby regulating the amount
of protein available to participate in cytotoxic drug 
efflux. As mentioned above, antisense oligonucleotides 
corresponding to essential processing sites in 
MOAT-encoding mRNA molecules may be utilized to inhibit 
MOAT protein production in targeted cells. Alterations in 
the physiological amount of MOAT proteins may dramatically 
affect the ability of these proteins to transport 
pharmacological reagents out of the cell. 

Host cells comprising at least one MOAT encoding DNA 
molecule are encompassed in the present invention. Host 
cells contemplated for use in the present invention 
include but are not limited to bacterial cells, fungal 
cells, insect cells, mammalian cells, and plant cells. 
The MOAT encoding DNA molecules may be introduced singly into
such host cells or in combination to assess the phenotype
of cells conferred by such expression. Methods for
introducing DNA molecules are also well known to those of
ordinary skill in the art. Such methods are set forth in
Ausubel et al., eds., Current Protocols in Molecular
Biology, John Wiley & Sons, NY, NY 1995, the disclosure of 
which is incorporated by reference herein. 

The availability of MOAT encoding nucleic acids 
enables the production of strains of laboratory mice 
carrying part or all of the MOAT genes or mutated 
sequences thereof. Such mice may provide an in vivo model
for development of novel chemotherapeutic agents.
Alternatively, the MOAT nucleic acid sequence information
provided herein enables the production of knockout mice in
which the endogenous genes encoding MOAT-B, MOAT-C, MOAT-D
or MOAT-E have been specifically inactivated. Methods of
introducing transgenes in laboratory mice are known to
those of skill in the art. Three common methods include:
1. integration of retroviral vectors encoding the foreign 
gene of interest into an early embryo; 2. injection of 
DNA into the pronucleus of a newly fertilized egg; and 3. 
the incorporation of genetically manipulated embryonic 
stem cells into an early embryo. 

The alterations to the MOAT gene envisioned herein
include modifications, deletions, and substitutions.
Modifications and deletions render the naturally occurring
gene nonfunctional, producing a "knock out" animal.
Substitution of the naturally occurring gene with a gene
from a second species results in an animal which carries
a MOAT gene from the second species. Substitution of the
naturally occurring gene with a gene having a mutation
results in an animal with a mutated MOAT protein. A
transgenic mouse carrying the human MOAT gene is generated
by direct replacement of the mouse MOAT gene with the
human gene. These transgenic animals are valuable for use
in in vivo assays for elucidation of other medical disorders
associated with cellular activities modulated by MOAT 
genes. A transgenic animal carrying a "knock out" of a 
MOAT encoding nucleic acid is useful for the establishment 
of a nonhuman model for chemotherapy resistance involving 
MOAT regulation. 

As a means to define the role that MOAT plays in 
mammalian systems, mice can be generated that cannot make
MOAT proteins because of a targeted mutational disruption
of a MOAT gene. 

The term "animal" is used herein to include all 
vertebrate animals, except humans. It also includes an 
individual animal in all stages of development, including 
embryonic and fetal stages. A "transgenic animal" is any 
animal containing one or more cells bearing genetic 
information altered or received, directly or indirectly, 
by deliberate genetic manipulation at the subcellular 
level, such as by targeted recombination or microinjection 
or infection with recombinant virus. The term "transgenic 
animal" is not meant to encompass classical cross-breeding 
or in vitro fertilization, but rather is meant to 
encompass animals in which one or more cells are altered 
by or receive a recombinant DNA molecule. This molecule 
may be specifically targeted to a defined genetic locus, be
randomly integrated within a chromosome, or it may be 
extrachromosomally replicating DNA. The term "germ cell 
line transgenic animal" refers to a transgenic animal in 
which the genetic alteration or genetic information was 
introduced into a germ line cell, thereby conferring the 
ability to transfer the genetic information to offspring. 
If such offspring in fact possess some or all of that
alteration or genetic information, then they, too, are 
transgenic animals. 

The alteration or genetic information may be foreign 
to the species of animal to which the recipient belongs, 
or foreign only to the particular individual recipient, or 
may be genetic information already possessed by the 
recipient. In the last case, the altered or introduced 
gene may be expressed differently than the native gene. 

The altered MOAT gene generally should not fully 
encode the same MOAT protein native to the host animal and 
its expression product should be altered to a minor or
great degree, or absent altogether. However, it is 
conceivable that a more modestly modified MOAT gene will 
fall within the compass of the present invention if it is 
a specific alteration. 

The DNA used for altering a target gene may be 
obtained by a wide variety of techniques that include, but 
are not limited to, isolation from genomic sources, 
preparation of cDNAs from isolated mRNA templates, direct 
synthesis, or a combination thereof. 

A preferred type of target cell for transgene 
introduction is the embryonal stem cell (ES). ES cells
may be obtained from pre-implantation embryos cultured in
vitro. Transgenes can be efficiently introduced into the 
ES cells by standard techniques such as DNA transfection 
or by retrovirus-mediated transduction. The resultant 
transformed ES cells can thereafter be combined with 
blastocysts from a non-human animal. The introduced ES 
cells thereafter colonize the embryo and contribute to the 
germ line of the resulting chimeric animal. 

One approach to the problem of determining the 
contributions of individual genes and their expression 
products is to use isolated MOAT genes to selectively 
inactivate the wild-type gene in totipotent ES cells (such 
as those described above) and then generate transgenic 
mice. The use of gene-targeted ES cells in the generation
of gene-targeted transgenic mice is known in the art. 

Techniques are available to inactivate or alter any 
genetic region to a mutation desired by using targeted 
homologous recombination to insert specific changes into 
chromosomal alleles. However, in comparison with 
homologous extrachromosomal recombination, which occurs at 
a frequency approaching 100%, homologous
plasmid-chromosome recombination was originally reported to
only be detected at frequencies between 10⁻⁶ and 10⁻³.
Nonhomologous plasmid-chromosome interactions are more
frequent, occurring at levels 10⁵-fold to 10²-fold greater
than comparable homologous insertion.
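
The practical consequence of these frequencies can be seen with simple arithmetic. The sketch below assumes, as a simplification not stated in the disclosure, that every integration event yields a recoverable clone and that no selection is applied; the function names are hypothetical.

```python
def homologous_fraction(fold_excess_nonhomologous):
    """Fraction of all integrants that are homologous, given the fold
    excess of nonhomologous over homologous insertion."""
    return 1.0 / (1.0 + fold_excess_nonhomologous)


def expected_cells_per_event(frequency):
    """Expected number of treated cells per homologous event."""
    return 1.0 / frequency


# Range quoted above: nonhomologous insertion 10^2- to 10^5-fold more frequent.
best_case = homologous_fraction(1e2)    # roughly 1 integrant in 101
worst_case = homologous_fraction(1e5)   # roughly 1 integrant in 100,001

# Range quoted above: homologous frequencies between 10^-6 and 10^-3.
cells_low_freq = expected_cells_per_event(1e-6)
cells_high_freq = expected_cells_per_event(1e-3)
```

Under these assumptions, on the order of one hundred to one hundred thousand unselected integrants would have to be examined per homologous recombinant, which is why detection and selection strategies are employed.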

To overcome this low proportion of targeted 
recombination in murine ES cells, various strategies have 
been developed to detect or select rare homologous 
recombinants. One approach for detecting homologous 
alteration events uses the polymerase chain reaction (PCR) 
to screen pools of transformant cells for homologous 
insertion, followed by screening of individual clones. 
Alternatively, a positive genetic selection approach has 
been developed in which a marker gene is constructed which 
will only be active if homologous insertion occurs, 
allowing these recombinants to be selected directly. One 
of the most powerful approaches developed for selecting 
homologous recombinants is the positive-negative selection 
(PNS) method developed for genes for which no direct 
selection of the alteration exists. The PNS method is 
more efficient for targeting genes which are not expressed 
at high levels because the marker gene has its own 
promoter. Non-homologous recombinants are selected
against by using the Herpes Simplex virus thymidine kinase
(HSV-TK) gene and selecting against its nonhomologous
insertion with effective herpes drugs such as gancyclovir
(GANC) or 1-(2-deoxy-2-fluoro-β-D-arabinofuranosyl)-5-
iodouracil (FIAU). By this counter selection, the number
of homologous recombinants in the surviving transformants
can be increased.

As used herein, a "targeted gene" or "knock-out" is a
DNA sequence introduced into the germline of a non-human
animal by way of human intervention, including but not
limited to, the methods described herein. The targeted
genes of the invention include DNA sequences which are 
designed to specifically alter cognate endogenous alleles. 

Methods of use for the transgenic mice of the 
invention are also provided herein. Knockout mice of the 
invention can be injected with tumor cells or treated with 
carcinogens to generate carcinomas. Such mice provide a 
biological system for assessing chemotherapy resistance as 
modulated by a MOAT gene of the invention. Accordingly, 
therapeutic agents which inhibit the action of these 
transporters and thereby prevent efflux of beneficial 
chemotherapeutic agents from tumor cells may be screened 
in studies using MOAT knock out mice. 

As described above, MOAT-encoding nucleic acids are 
also used to advantage to produce large quantities of 
substantially pure MOAT proteins, or selected portions 
thereof.

B. MOAT Proteins and Antibodies 

Purified full length MOAT proteins, or fragments 
thereof, may be used to produce polyclonal or monoclonal 
antibodies which also may serve as sensitive detection 
reagents for the presence and accumulation of MOAT 
proteins (or complexes containing MOAT proteins) in 
mammalian cells. Recombinant techniques enable expression 
of fusion proteins containing part or all of MOAT 
proteins. The full length proteins or fragments of the 
proteins may be used to advantage to generate an array of 
monoclonal antibodies specific for various epitopes of 
MOAT proteins, thereby providing even greater sensitivity 
for detection of MOAT proteins in cells. 

Polyclonal or monoclonal antibodies 
immunologically specific for MOAT proteins may be used in 
a variety of assays designed to detect and quantitate the
proteins. Such assays include, but are not limited to:
(1) flow cytometric analysis; (2) immunochemical
localization of MOAT proteins in tumor cells; and (3)
immunoblot analysis (e.g., dot blot, Western blot) of
extracts from various cells. Additionally, as described
above, anti-MOAT antibodies can be used for purification
of MOAT proteins and any associated subunits (e.g.,
affinity column purification, immunoprecipitation).

From the foregoing discussion, it can be seen that 
MOAT-encoding nucleic acids, MOAT expressing vectors, MOAT 
proteins and anti-MOAT antibodies of the invention can be 
used to detect MOAT gene expression and alter MOAT protein 
accumulation for purposes of assessing the genetic and 
protein interactions involved in the development of drug 
resistance in tumor cells. 



C. Methods and Kits Employing the
Compositions of the Present Invention

From the foregoing discussion, it can be seen
that MOAT-encoding nucleic acids, MOAT-expressing vectors,
MOAT proteins and anti-MOAT antibodies of the invention
can be used to detect MOAT gene expression and alter MOAT 
protein accumulation for purposes of assessing the genetic 
and protein interactions giving rise to chemotherapy 
resistance in tumor cells. 

Exemplary approaches for detecting MOAT nucleic acid 
or polypeptides/proteins include: 

a) comparing the sequence of nucleic acid in the 
sample with the MOAT nucleic acid sequence to determine 
whether the sample from the patient contains mutations; or 

b) determining the presence, in a sample from a 
patient, of the polypeptide encoded by the MOAT gene and, 
if present, determining whether the polypeptide is full
length, and/or is mutated, and/or is expressed at the 
normal level; or 

c) using DNA restriction mapping to compare the 
restriction pattern produced when a restriction enzyme 
cuts a sample of nucleic acid from the patient with the 
restriction pattern obtained from normal MOAT gene or from 
known mutations thereof; or, 

d) using a specific binding member capable of binding 
to a MOAT nucleic acid sequence (either normal sequence or 
known mutated sequence), the specific binding member
comprising nucleic acid hybridizable with the MOAT 
sequence, or substances comprising an antibody domain with 
specificity for a native or mutated MOAT nucleic acid 
sequence or the polypeptide encoded by it, the specific 
binding member being labelled so that binding of the 
specific binding member to its binding partner is 
detectable; or, 

e) using PCR involving one or more primers based on 
normal or mutated MOAT gene sequence to screen for normal 
or mutant MOAT gene in a sample from a patient. 
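
Approach (c) above, comparison of restriction patterns, may be sketched computationally as an in silico digest. The example uses the EcoRI recognition site GAATTC, which is cut after the first G; the function name and both toy sequences are hypothetical, and the sketch treats the DNA as a linear molecule.

```python
def fragment_lengths(dna, site="GAATTC", cut_offset=1):
    """Sorted fragment lengths after cutting linear dna at every site.

    cut_offset is the position of the cut within the recognition site
    (1 for EcoRI, which cuts between G and AATTC).
    """
    dna = dna.upper()
    cuts = []
    pos = dna.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = dna.find(site, pos + 1)
    bounds = [0] + cuts + [len(dna)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))


# Hypothetical "normal" sequence with two EcoRI sites, and a variant in
# which a single base change destroys the second site.
normal = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
variant = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

normal_pattern = fragment_lengths(normal)
variant_pattern = fragment_lengths(variant)
patterns_differ = normal_pattern != variant_pattern
```

A difference between the two fragment-length lists corresponds to the altered banding pattern that would distinguish the variant from the normal sequence on a gel.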

A "specific binding pair" comprises a specific 
binding member (sbm) and a binding partner (bp) which have 
a particular specificity for each other and which in 
normal conditions bind to each other in preference to 
other molecules. Examples of specific binding pairs are 
antigens and antibodies, ligands and receptors and 
complementary nucleotide sequences. The skilled person is 
aware of many other examples and they do not need to be 
listed here. Further, the term "specific binding pair" is 
also applicable where either or both of the specific 
binding member and the binding partner comprise a part of 
a large molecule. In embodiments in which the specific 
binding pair are nucleic acid sequences, they will be of a 
length sufficient to hybridize to each other under the conditions of the 
assay, preferably greater than 10 nucleotides long, more 
preferably greater than 15 or 20 nucleotides long. 

In most embodiments for screening for alleles giving 
rise to chemotherapy resistance, the MOAT nucleic acid in 
the biological sample will initially be amplified, e.g. using 
PCR, to increase the amount of the analyte as compared to 
other sequences present in the sample. This allows the 
target sequences to be detected with a high degree of 
sensitivity if they are present in the sample. This 
initial step may be avoided by using highly sensitive 
array techniques that are becoming increasingly important 
in the art. 

The identification of the MOAT gene and its 
association with a particular chemotherapy resistance 
paves the way for aspects of the present invention to 
provide the use of materials and methods, such as are 
disclosed and discussed above, for establishing the 
presence or absence in a test sample of a variant form of 
the gene, in particular an allele or variant specifically 
associated with chemotherapy resistance. This may be done 
to assess the propensity of the tumor to exhibit 
chemotherapy resistance. 

In still further embodiments, the present invention 
concerns immunodetection methods for binding, purifying, 
removing, quantifying or otherwise generally detecting 
biological components. The encoded proteins or peptides of 
the present invention may be employed to detect antibodies 
having reactivity therewith, or, alternatively, antibodies 
prepared in accordance with the present invention, may be 
employed to detect the encoded proteins or peptides. The 
steps of various useful immunodetection methods have been 
described in the scientific literature, such as, e.g., 
Nakamura et al. (1987). 

In general, the immunobinding methods include 
obtaining a sample suspected of containing a protein, 
peptide or antibody, and contacting the sample with an 
antibody or protein or peptide in accordance with the 
present invention, as the case may be, under conditions 
effective to allow the formation of immunocomplexes. 

The immunobinding methods include methods for 
detecting or quantifying the amount of a reactive 
component in a sample, which methods require the detection 
or quantitation of any immune complexes formed during the 
binding process. Here, one would obtain a sample 
suspected of containing a MOAT gene encoded protein, 
peptide or a corresponding antibody, and contact the 
sample with an antibody or encoded protein or peptide, as 
the case may be, and then detect or quantify the amount of 
immune complexes formed under the specific conditions. 

In terms of antigen detection, the biological sample 
analyzed may be any sample that is suspected of containing 
the MOAT antigen, such as a tumor tissue section or 
specimen, a homogenized tissue extract, an isolated cell, 
a cell membrane preparation, or separated or purified forms 
of any of the above protein-containing compositions. 

Contacting the chosen biological sample with the 
protein, peptide or antibody under conditions effective 
and for a period of time sufficient to allow the formation 
of immune complexes (primary immune complexes) is 
generally a matter of simply adding the composition to the 
sample and incubating the mixture for a period of time 
long enough for the antibodies to form immune complexes 
with, i.e., to bind to, any antigens present. After this 
time, the sample-antibody composition, such as a tissue 
section, ELISA plate, dot blot or Western blot, will 
generally be washed to remove any non-specifically bound 
antibody species, allowing only those antibodies 
specifically bound within the primary immune complexes to 
be detected. 

In general, the detection of immunocomplex formation 
is well known in the art and may be achieved through the 
application of numerous approaches. These methods are 
generally based upon the detection of a label or marker, 
such as any radioactive, fluorescent, biological or 
enzymatic tags or labels of standard use in the art. U.S. 
Patents concerning the use of such labels include U.S. 
Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 
4,277,437; 4,275,149 and 4,366,241, each incorporated 
herein by reference. Of course, one may find additional 
advantages through the use of a secondary binding ligand 
such as a second antibody or a biotin/avidin ligand 
binding arrangement, as is known in the art. 

In one broad aspect, the present invention 
encompasses kits for use in detecting expression of MOAT 
encoding nucleic acids in biological samples, including 
biopsy samples. Such a kit may comprise one or more pairs 
of primers for amplifying nucleic acids corresponding to 
the MOAT gene. The kit may further comprise samples of 
total mRNA derived from tissues expressing at least one or 
a subset of the MOAT genes of the invention, to be used as 
controls. The kit may also comprise buffers, nucleotide 
bases, and other compositions to be used in hybridization 
and/or amplification reactions. Each solution or 
composition may be contained in a vial or bottle and all 
vials held in close confinement in a box for commercial 
sale. In a further embodiment, the invention encompasses 
a kit for use in detecting MOAT proteins in chemotherapy 
resistant cancer cells comprising antibodies specific for 
MOAT proteins encoded by the MOAT nucleic acids of the 
present invention. 

Another aspect of the present invention comprises 
screening methods employing host cells expressing one or 
more MOAT genes of the invention. An advantage of having 
discovered the complete coding sequences of MOAT-B to MOAT-E 
is that cell lines that overexpress MOAT-B, MOAT-C, MOAT-D or 
MOAT-E can be generated using standard transfection protocols. 
Cells that overexpress the complete cDNA will also harbor the 
complete proteins, a feature that is essential for 
biological activity of the proteins. The overexpressing cell 
lines will be useful in several ways: 1) The drug 
sensitivity of overexpressing cell lines can be tested 
with a variety of known anticancer agents in order to 
determine the spectrum of anticancer agents for which the 
transporter confers resistance; 2) The drug sensitivity of 
overexpressing cell lines can be used to 
determine whether newly discovered anticancer agents are 
transported out of the cell by one of the discovered 
transporters; 3) Overexpressing cell lines can be used to 
identify potential inhibitors that reduce the activity of 
the transporters. Such inhibitors are of great 
clinical interest in that they may enhance the activity of 
known anticancer agents, thereby increasing their 
effectiveness. Reduced activity will be detected by 
restoration of anticancer drug sensitivity, or by 
reduction of transporter-mediated cellular efflux of 
anticancer agents. In vitro biochemical studies designed 
to identify reduced transporter activity in 
the presence of potential inhibitors can also be performed 
using membranes prepared from overexpressing cell lines; 
and 4) Overexpressing cell lines can also be used to 
determine whether pharmaceutical agents that are not 
anticancer agents are transported out of the cell by the 
transporters. 

The following protocols are provided to facilitate 
the practice of the present invention. 

Isolation of MOAT-B cDNA 

Forward {CT(A/G/T) GT(A/G/T) GC(A/G/T) GT(A/G/T) 
GT(A/G/T) GG(A/G/C/T)} (SEQ ID NO: 9) and reverse {(G/A)CT 
(A/G/C/T) A(A/G/C) (A/G/C/T)GC (A/G/C/T) (G/C) (T/A) 
(A/G/C/T) A(A/G) (A/G/C/T)GG (A/G/C/T)TC (A/G)TC} (SEQ ID 
NO: 16) degenerate oligonucleotide primers were designed 
based upon the first nucleotide binding folds of human 
MRP, CFTR, and MDR1. Bacteriophage DNA isolated from a 
C200 cDNA library prepared in the λpCEV27 phagemid 
vector (17) was used as template in PCR reactions 
containing 250 ng cDNA, 5 µM primers, 50 mM KCl, 10 mM 
Tris-HCl, pH 8.3, 3 mM MgCl2, 0.05% gelatin, 0.2 mM dNTP 
and Taq polymerase (Perkin Elmer Cetus). Five cycles of 
PCR were performed as follows: 94°C for 1 minute, 40°C 
for 2 minutes, 72°C for 3 minutes. Twenty-five cycles 
were then performed as follows: 94°C for 1 minute, 55°C 
for 1 minute, and 72°C for 1 minute. The resulting 
reaction products were used as template in a second 
round of PCR, as described above, with nested forward 
{CGGGATCC AG(A/G) GA(A/G) AA(C/T) AT(A/C/T) CT(A/G/C/T) 
TTT GG(A/G/C/T)} (SEQ ID NO: 17) and reverse {CGGAATTC 
(A/G/T/C)TC (A/G)TC (A/C/T)AG (A/G/C/T)AG (A/G)TA 
(A/T/G)AT (A/G)TC} (SEQ ID NO: 18) degenerate 
oligonucleotide primers. PCR reaction products were 
isolated from an agarose gel and subcloned into the 
BamHI and EcoRI sites of pBluescript (Stratagene). 
Nucleotide sequence analysis 
was performed on plasmid DNA prepared from ampicillin 
resistant transformants. Additional cDNA clones were 
isolated from C200 (ovary) and B5 (breast) cDNA libraries 
by plaque hybridization using the PCR product as the 
initial radiolabeled probe. 
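
The degenerate primer notation used above, in which alternative bases at one position are written as (A/G/T), can be expanded programmatically. The sketch below is illustrative only: the helper names `parse_degenerate`, `degeneracy` and `expand` are introduced here and are not part of the protocol. It counts how many distinct oligonucleotides the first-round forward primer represents.

```python
import re
from itertools import product

def parse_degenerate(primer: str) -> list[str]:
    """Split a primer such as 'CT(A/G/T)GG' into per-position base choices."""
    primer = primer.replace(" ", "")
    positions = []
    # Each match is either one fixed base or a parenthesized set of alternatives.
    for fixed, group in re.findall(r"([ACGT])|\(([ACGT/]+)\)", primer):
        positions.append(fixed if fixed else group.replace("/", ""))
    return positions

def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    n = 1
    for choices in parse_degenerate(primer):
        n *= len(choices)
    return n

def expand(primer: str):
    """Yield every concrete sequence encoded by a degenerate primer."""
    for combo in product(*parse_degenerate(primer)):
        yield "".join(combo)

# First-round forward primer as written in the protocol.
forward = "CT(A/G/T) GT(A/G/T) GC(A/G/T) GT(A/G/T) GT(A/G/T) GG(A/G/C/T)"
forward_length = len(parse_degenerate(forward))
forward_degeneracy = degeneracy(forward)

print(forward_length, forward_degeneracy)
```

Five three-fold positions and one four-fold position give 3^5 x 4 = 972 sequences of 18 nucleotides each.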

RNA Blot Analysis 

Blots containing polyA+ RNA isolated from human 
tissues (Clontech) were prehybridized at 45°C for 8 hours 
in 50% formamide, 4X SSC, 4X Denhardt's solution, 0.04 M 
sodium phosphate monobasic, pH 6.5, 0.8% (w/v) glycine, 
0.1 mg/ml sheared denatured salmon sperm DNA. 
Hybridization was performed at 45°C with 32P-labeled MOAT-B 
or GAPDH probes in a solution containing 50% formamide, 3X 
SSC, 0.04 M sodium phosphate pH 6.5, 10% dextran sulfate, 
0.1 mg/ml sheared denatured salmon sperm DNA. Blots were 
washed 2 times for 15 min at 65°C in 2X SSC, 5 mM Tris-HCl 
pH 7.4, 0.5% SDS, 2.5 mM EDTA, 0.1% sodium pyrophosphate pH 
8.0, and subsequently washed 2 times for 15 min in 0.1X 
SSC. Blots were then subjected to autoradiography. 
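
The SSC strengths named in this protocol can be converted to salt concentrations. The sketch below assumes the standard definition of 1X SSC (150 mM NaCl, 15 mM trisodium citrate), which the text itself does not state; the function name `ssc_composition` is introduced here for illustration.

```python
# Standard 1X SSC (assumed definition): 150 mM NaCl and 15 mM trisodium citrate.
NACL_MM_AT_1X = 150.0
CITRATE_MM_AT_1X = 15.0

def ssc_composition(fold: float) -> dict[str, float]:
    """Return NaCl and citrate concentrations (mM) for a given SSC strength."""
    return {
        "NaCl_mM": round(fold * NACL_MM_AT_1X, 3),
        "citrate_mM": round(fold * CITRATE_MM_AT_1X, 3),
    }

# SSC strengths named in the protocol, in the order they are used.
protocol_steps = [
    ("prehybridization", 4.0),
    ("hybridization", 3.0),
    ("first washes", 2.0),
    ("final washes", 0.1),
]

for step, fold in protocol_steps:
    comp = ssc_composition(fold)
    print(f"{step}: {fold}X SSC = {comp['NaCl_mM']} mM NaCl, "
          f"{comp['citrate_mM']} mM citrate")
```

The falling salt concentration from hybridization to the final washes reflects increasing stringency.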

Chromosomal localization 

Preparation of metaphase spreads from 
phytohemagglutinin-stimulated lymphocytes of a healthy 
female donor, and fluorescence in situ hybridization and 
detection of immunofluorescence were carried out as 
previously described (18). A 2.2-kb cDNA clone of MOAT-B 
inserted in pBluescript was biotinylated by nick 
translation in a reaction containing 1 µg DNA, 20 µM each 
of dATP, dCTP and dGTP, 1 µM dTTP, 25 mM Tris-HCl, pH 7.5, 
5 mM MgCl2, 10 mM β-mercaptoethanol, 10 µM biotin-16-dUTP 
(Boehringer Mannheim), 2 units DNA polymerase I/DNase I 
(GIBCO, BRL) and water to a total volume of 50 µl. The 
probe was denatured and hybridized to metaphase spreads 
overnight at 37°C. Hybridization sites were detected with 
fluorescein-labeled avidin (Oncor) and amplified by 
addition of anti-avidin antibody (Oncor) and a second 
layer of fluorescein-labeled avidin. The chromosome 
preparations were counterstained with DAPI and observed 
with a Zeiss Axiophot epifluorescence microscope equipped 
with a cooled charge coupled device camera (Photometrics, 
Tucson AZ) operated by a Macintosh computer work station. 
Digitized images of DAPI staining and fluorescein signals 
were captured, pseudo-colored and merged using Oncor Image 
version 1.6 software. 
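
For bench use, the final concentrations in the 50 µl nick translation reaction can be turned into pipetting volumes with the usual dilution relation C1V1 = C2V2. The stock concentrations below are hypothetical (the text gives only final concentrations), and the helper `stock_volume_ul` is introduced here for illustration.

```python
FINAL_VOLUME_UL = 50.0

def stock_volume_ul(final_conc: float, stock_conc: float,
                    final_volume_ul: float = FINAL_VOLUME_UL) -> float:
    """Volume of stock needed; both concentrations must share the same unit."""
    return final_conc * final_volume_ul / stock_conc

# name: (final concentration from the protocol, HYPOTHETICAL stock concentration)
recipe = {
    "dATP (uM)": (20.0, 1000.0),
    "dCTP (uM)": (20.0, 1000.0),
    "dGTP (uM)": (20.0, 1000.0),
    "dTTP (uM)": (1.0, 100.0),
    "biotin-16-dUTP (uM)": (10.0, 1000.0),
    "Tris-HCl pH 7.5 (mM)": (25.0, 1000.0),
    "MgCl2 (mM)": (5.0, 100.0),
    "beta-mercaptoethanol (mM)": (10.0, 1000.0),
}

volumes = {name: stock_volume_ul(final, stock) for name, (final, stock) in recipe.items()}

# Volume left over for the 1 ug of DNA, the enzyme mix and water.
remaining_ul = FINAL_VOLUME_UL - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol} ul")
print(f"DNA + enzyme + water: {remaining_ul} ul")
```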



Isolation of MOAT-C and MOAT-D cDNA 

MOAT-C and MOAT-D cDNA clones were isolated by plaque 
hybridization from bacteriophage cDNA libraries using the 
I.M.A.G.E. clones as the initial probes (ATCC). 

RNA blot analysis 

Blots containing polyA+ RNA isolated from human 
tissues were purchased from Clontech and 
hybridized with radiolabeled MOAT-C, MOAT-D or actin 
probes according to the manufacturer's directions. 

Chromosomal localization 

Preparation of metaphase spreads from 
phytohemagglutinin-stimulated lymphocytes of a healthy 
female donor, and fluorescence in situ hybridization and 
detection of immunofluorescence were carried out as 
previously described (18). A MOAT-C probe inserted in 
pBluescript, or MOAT-D probe inserted in pBluescript, was 
biotinylated by nick translation in a reaction containing 
1 µg DNA, 20 µM each of dATP, dCTP and dGTP, 1 µM dTTP, 25 
mM Tris-HCl, pH 7.5, 5 mM MgCl2, 10 mM β-mercaptoethanol, 
10 µM biotin-16-dUTP (Boehringer Mannheim), 2 units DNA 
polymerase I/DNase I (GIBCO, BRL) and water to a total 
volume of 50 µl. The probe was denatured and hybridized 
to metaphase spreads overnight at 37°C. Hybridization 
sites were detected with fluorescein-labeled avidin 
(Oncor) and amplified by addition of anti-avidin antibody 
(Oncor) and a second layer of fluorescein-labeled avidin. 
The chromosome preparations were counterstained with DAPI 
and observed with a Zeiss Axiophot epifluorescence 
microscope equipped with a cooled charge coupled device 
camera (Photometrics, Tucson AZ) operated by a Macintosh 
computer work station. Digitized images of DAPI staining 
and fluorescein signals were captured, pseudo-colored and 
merged using Oncor Image version 1.6 software. 

The following examples are provided to illustrate 
various embodiments of the invention. They are not 
intended to limit the invention in any way. 

EXAMPLE I 
Isolation of MOAT-B cDNA. 

A degenerate PCR approach was used to isolate 
MRP-related transporters. Degenerate oligonucleotide 
primers were prepared based upon the N-terminal nucleotide 
binding folds of MRP and other eukaryotic transporters, 
and used in conjunction with DNA prepared from an ovarian 
cancer cell line bacteriophage library. Nucleotide 
sequence analysis of one of the resulting PCR products 
indicated that it encoded a segment of a novel nucleotide 
binding fold that was most closely related to MRP and 
cMOAT. Overlapping cDNA clones were isolated from ovarian 
and breast bacteriophage libraries by plaque hybridization 
using the PCR product as the initial probe. A total of 
5.9 kb of cDNA was isolated. Nucleotide sequence analysis 
revealed two classes of cDNA clones that were about 
equally represented among isolates from each of the two 
bacteriophage libraries. The first class contained an 
open reading frame of 3975 bp that was bordered by in 
frame stop codons located at positions -76 and -42 
(relative to the putative initiation codon) and 3976, and 
encoded a predicted protein of 1325 amino acids, which is 
designated MOAT-B. The open reading frame was followed by 
approximately 2 kb of 3' untranslated sequences. The most 
upstream ATG in the open reading frame was located in the 
sequence context -4 CAAGATGC +4. The A at position -3 of the 
putative translation initiation codon was in agreement 
with the major feature of the Kozak consensus sequence, 
but the C at position +4 was divergent from the more usual 
G. The second class of cDNA clones was identical to the 
first with the exception of a single nucleotide. These 
clones harbored an additional T following nucleotide 3872 
of the first class of clones, close to the C-terminus of 
the predicted protein. This additional nucleotide 
resulted in a frame shift such that the predicted protein 
of the second class of cDNA clones was 22 residues shorter 
than that of the first class of cDNA clones, and in which 
the C-terminal 34 residues of the latter reading frame 
were replaced by 12 distinct residues. See brief 
description of Figure 1. 
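
The lengths quoted above are mutually consistent, as the short check below illustrates. It uses only the numbers stated in the text; the variable names are introduced here for illustration.

```python
ORF_BP = 3975                  # open reading frame of the first class of clones
INSERTION_AFTER_NT = 3872      # extra T follows this nucleotide in the second class
REPLACED_RESIDUES = 34         # C-terminal residues lost from the first reading frame
NEW_RESIDUES = 12              # distinct residues read in the shifted frame

# 3975 bp is a whole number of codons; the in-frame stop begins at position 3976.
assert ORF_BP % 3 == 0
class1_length = ORF_BP // 3

# The codon containing nucleotide 3872 is the first one touched by the insertion.
first_affected_codon = (INSERTION_AFTER_NT - 1) // 3 + 1

class2_length = class1_length - REPLACED_RESIDUES + NEW_RESIDUES
shortening = class1_length - class2_length

print(class1_length, first_affected_codon, class2_length, shortening)
```

The first affected codon (1291) lies immediately upstream of the 34 replaced residues (1292 to 1325), which agrees with the statement that the frameshift lies close to the C-terminus.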

Analysis of the MOAT-B Predicted Structure. 

Comparison of the MOAT-B predicted protein with 
complete coding sequences in protein data bases using the 
BLAST program indicated that it shared significant 
similarity with several eukaryotic ABC transporters. 
See Table I. 
Table I. Comparison of peptide domains of MOAT-B with 
those of other eukaryotic ABC transporters 

MOAT-B domain  TM1       NBF1       Linker region  TM2        NBF2         C-terminus   Overall
(peptide)      (88-376)  (428-576)  (577-705)      (706-992)  (1058-1216)  (1217-1325)  identity

                                      percent identity

MRP human      28.6      55.6       27.9           33.3       61.6         51.6         39.2
YCF1 yeast     27        56         27.9           34         57.2         48.5         38.9
cMOAT human    33.2      53.3       32.8           31.4       55.3         44.9         38
CFTR human     30.5      48         27.9           37.7       44           21           36.3
SUR rat        28.1      41.3       28.2           30         52.8         42.8         32.9
MDR1 human     17.6      39.2       21.1           17.3       32.2         40.3         23.3


The indicated domains are: TM1, segment containing the 
transmembrane spanning domain N-terminal to NBF1; NBF1 and NBF2, 
nucleotide binding folds 1 and 2; Linker region, segment located 
between NBF1 and TM2; TM2, segment containing the transmembrane spanning 
domain located between the two NBFs; C-terminus, segment between NBF2 
and the C-terminus of the proteins. Sequence alignments were generated 
using the PILEUP program of the GCG package. Percent amino acid 
identity with MOAT-B domains is shown. 
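
Percent identity between two aligned domains can be computed as sketched below. This is not the PILEUP procedure itself; it assumes the two sequences have already been aligned, and it assumes one common convention in which columns containing a gap are excluded from the denominator (the text does not state which convention was used). The peptide strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over gap-free columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue                      # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return round(100.0 * identical / compared, 1)

# Hypothetical aligned fragments (assumption, not taken from the figures).
seq_a = "GSGKSTLL-SAL"
seq_b = "GAGKSSLLQS-L"
identity = percent_identity(seq_a, seq_b)
print(identity)
```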



Typical features of eukaryotic ABC transporters were 
present in the predicted MOAT-B protein. See Figure 1. 
Overall the protein was composed of a tandem repeat of a 
nucleotide binding fold appended C-terminal to a 
hydrophobic domain that contained several potential 
transmembrane spanning helices. Conserved Walker A and B 
ATP binding sites were present in each of the nucleotide 
binding folds. See Figure 2A. In addition, a conserved C 
motif, the signature sequence of ABC transporters, was 
present in each nucleotide binding fold. Analysis of 
potential transmembrane motifs using the TMAP program (19) 
and an input sequence alignment of MOAT-B and MOAT-C, a 
transporter highly related to MOAT-B, predicted 12 
transmembrane helices with 6 transmembrane segments in 
each of the two hydrophobic domains. This 6+6 
configuration of predicted transmembrane helices is in 
agreement with topological models proposed for MRP and 
other ABC transporters (20, 21), and is shown in Figure 1. 
However, alternative predictions of transmembrane segments 
were obtained using different program parameters or input 
sequence alignments. For example, when the TMAP program 
was used with an input sequence alignment consisting of 
human MRP, rat cMOAT, rat sulfonylurea receptor (SUR), 
human cystic fibrosis transmembrane conductance regulator 
(CFTR) and human P-glycoprotein, a 6+5 configuration was 
predicted. The only substantial difference between the 
latter prediction and the structure shown in Figure 1 is 
that transmembrane segments 9 (829-853) and 10 (855-878) 
were replaced by a single predicted transmembrane segment 
spanning amino acids 847-875. 

Among ABC transporters, the degree of similarity of 
the nucleotide binding folds is considered to be the best 
indicator of functional conservation. Comparison of the 
nucleotide binding folds of MOAT-B with other eukaryotic 
ABC transporters indicated that it was most closely 
related to MRP, the yeast cadmium resistance protein 
(YCF1) and cMOAT (Table I), three transporters that have 
organic anions as substrates. The MOAT-B NBF1 was 55.6, 
56.0 and 53.3 percent identical, and the MOAT-B NBF2 was 
61.6, 57.2 and 55.3 percent identical to the first and 
second nucleotide binding folds of human MRP, YCF1 and 
human cMOAT, respectively. Aside from the latter 
transporters, the MOAT-B nucleotide binding folds were 
most closely related to those of CFTR and SUR. The MOAT-B 
nucleotide binding folds shared significantly less 
similarity with those of MDR1. Alignment of the MOAT-B 
nucleotide binding folds with those of other eukaryotic 
transporters is shown in Figure 2A. Analysis of the 
overall amino acid identity of MOAT-B with other ABC 
transporters also indicated that it was most closely 
related to MRP, YCF1 and cMOAT (Table I). Overall MOAT-B 
was 39.2, 38.9 and 38 percent identical to these 
transporters, respectively. Figure 2B shows a comparison 
of the hydropathy profiles of MOAT-B with those of other 
eukaryotic transporters. This comparison reveals that 
MOAT-B (1325 amino acids) is approximately 200 amino acids 
smaller than MRP (1531 residues), cMOAT (1545 residues) 
and YCF1 (1515 residues), and that this size difference is 
largely accounted for by the absence in MOAT-B of an amino 
terminal hydrophobic extension that is present in MRP, 
cMOAT and YCF1 (22). This N-terminal hydrophobic segment 
is predicted to harbor several transmembrane spanning 
segments, and is also present in SUR. 

Expression Pattern of MOAT-B in Human Tissues. 

To gain insight into the possible function of MOAT-B, 
its expression pattern in a variety of human tissues was 
examined by RNA blot analysis. As shown in Figure 3, a 
MOAT-B transcript of approximately 6 kb was readily 
detected. The isolation of 5.9 kb of MOAT-B cDNA was 
consistent with this size. MOAT-B expression was detected 
in each of the 16 tissues analyzed. Transcript levels were 
highest in prostate and lowest in liver and peripheral 
blood leukocytes, for which prolonged exposures of film 
were required to detect expression. Intermediate levels of 
expression were observed in other tissues. 

Chromosomal Localization of the MOAT-B Gene. 

The MOAT-B chromosomal localization was determined by 
fluorescence in situ hybridization. As shown in Figure 4, 
hybridization of the MOAT-B probe to metaphase spreads 
revealed specific labeling at human chromosome band 13q32. 
Fluorescent signals were detected on chromosome 13 in each 
of 19 metaphase spreads scored. Of 135 signals observed, 
62 (46%) were on 13q. Among these signals, 61 localized 
at 13q32, near the boundary between 13q31 and 13q32. 
Paired (on sister chromatids) signals were only seen at 
band 13q32. In several metaphases, signals on a single 
chromatid were observed at chromosome bands 6p21 or 4q21, 
suggesting hybridization to distantly related sequences. 
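
The signal counts above can be restated as proportions. The short calculation below uses only the counts given in the text and is included as a convenience check.

```python
total_signals = 135      # all fluorescent signals scored
signals_on_13q = 62      # signals on the long arm of chromosome 13
signals_at_13q32 = 61    # of those, signals at band 13q32

pct_on_13q = round(100 * signals_on_13q / total_signals)
pct_of_13q_at_band = round(100 * signals_at_13q32 / signals_on_13q, 1)

print(pct_on_13q, pct_of_13q_at_band)
```

Nearly all signals on 13q fall at a single band, which is consistent with the specific labeling described in the text.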

EXAMPLE II 
Isolation of MOAT-C and MOAT-D cDNA. 

Isolation of the MOAT-B transporter as described 
above suggested the possibility that there were other 
MRP/cMOAT-related transporters. A BLAST search (36) of 
the nonredundant expressed sequence tag data base using 
MRP and related yeast transporters revealed two clones 
with significant similarity to MRP and cMOAT. The first 
of these sequences (I.M.A.G.E. consortium clone 113196) 
was 1.2 kb in length, 800 bp of which encoded an 
MRP-related peptide. A segment of this clone was used as 
a probe to screen ovarian and hematopoietic bacteriophage 
libraries. Analysis of these cDNA clones indicated that 
they contained approximately 2 kb of additional coding 
sequence not present in clone 113196. An additional 1655 
bp of 5' sequence was obtained by several rounds of RACE 
using the bacteriophage DNA prepared from the ovarian cDNA 
library as template. The continuity of the sequences 
obtained by RACE with the cDNA clones isolated from 
bacteriophage libraries was confirmed by nucleotide 
sequence analysis of a 2 kb product obtained by RT/PCR 
using an upstream oligonucleotide primer located at the 5' 
end of the RACE sequence and a downstream primer located 
at the 5' end of the cDNA obtained by plaque 
hybridization. A total of approximately 5.9 kb of cDNA 
sequence was isolated. Nucleotide sequence analysis 
revealed an open reading frame of 4311 bp that was 
preceded by an in frame stop codon located at position 
-93 (relative to the putative initiation codon), and that 
encoded a predicted protein of 1437 amino acids, which is 
designated MOAT-C herein. The open reading frame was 
followed by approximately 1.4 kb of 3' untranslated 
sequences in which a polyadenylation sequence (AAUAAA) was 
located 20 bp upstream of the poly(A) tail. The most 
upstream ATG in the open reading frame was located in the 
sequence context -4 GAAGATGA +4. The A at position -3 of the 
putative translation initiation codon was in agreement 
with the major feature of the Kozak consensus sequence, 
but the A at position +4 was divergent from the more usual 
G (37). The second sequence identified in our data base 
search (I.M.A.G.E. consortium clone 208097) was 1.2 kb in 
length, of which 588 bp encoded an MRP-related peptide. A 
segment of this clone was used as a probe to screen liver 
and monocyte bacteriophage cDNA libraries, and 5' cDNA 
segments of the isolated cDNA clones were used in a 
subsequent round of screening. Together approximately 5.2 
kb of cDNA sequence was isolated. Nucleotide sequence 
analysis revealed an open reading frame of 4570 bp, which 
is designated MOAT-D herein. The open reading frame was 
followed by approximately 0.6 kb of 3' untranslated 
sequences in which a polyadenylation sequence (AAUAAA) was 
located 12 bp upstream of the poly(A) tail. An upstream 
in frame stop codon was not present in the MOAT-D cDNA 
clones, and attempts to obtain additional upstream 
sequences by RACE using as template cDNA prepared from 
sources in which MOAT-D is abundant were not successful. 
The most upstream ATG in the open reading frame 
(nucleotide position 5-7), located in the sequence context 
-4 ATGGATGG +4, was therefore designated as the translational 
initiation site. The G at position +4 was in good 
agreement with the Kozak consensus sequence, but the T at 
-3 was divergent from the more usual A (37). Although an 
upstream in frame stop codon was not identified in the 
MOAT-D cDNA clones, the size of the encoded protein was 
within one amino acid of the size of the transporter with 
which it shares the highest degree of identity (MRP), 
suggesting that the complete MOAT-D open reading frame was 
present in the isolated cDNA clones. 
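
The initiation-codon contexts reported for MOAT-B, MOAT-C and MOAT-D can be compared against the two Kozak positions discussed in the text. The sketch below assumes each context is written as eight nucleotides spanning -4 to +4, with the A of the ATG as +1, and treats a purine at -3 and a G at +4 as the consensus features; the function name `kozak_check` is introduced here for illustration.

```python
def kozak_check(context: str) -> dict[str, object]:
    """Inspect positions -3 and +4 of an 8-nt context running from -4 to +4."""
    if len(context) != 8 or context[4:7] != "ATG":
        raise ValueError("expected 8 nt with ATG at positions +1 to +3")
    minus3 = context[1]      # index 0 is -4, so index 1 is -3
    plus4 = context[7]       # base immediately after the ATG
    return {
        "-3": minus3,
        "+4": plus4,
        "purine_at_-3": minus3 in "AG",
        "G_at_+4": plus4 == "G",
    }

# Contexts as reported in the text.
contexts = {
    "MOAT-B": "CAAGATGC",
    "MOAT-C": "GAAGATGA",
    "MOAT-D": "ATGGATGG",
}

results = {name: kozak_check(ctx) for name, ctx in contexts.items()}
for name, res in results.items():
    print(name, res)
```

MOAT-B and MOAT-C match at -3 but not at +4, whereas MOAT-D matches at +4 but not at -3, as described above.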

Analysis of the MOAT-C and MOAT-D Predicted Proteins. 

Comparison of the MOAT-C and MOAT-D predicted 
proteins with complete coding sequences in protein data 
bases using the BLAST program indicated that they shared 
significant similarity with several eukaryotic ABC 
transporters. Typical features of eukaryotic ABC 
transporters were present in the predicted proteins. See 
Figure 5. Overall the proteins were composed of 
hydrophobic domains containing potential transmembrane 
spanning helices and two nucleotide binding folds. 
Conserved Walker A and B ATP binding sites, as well as a 
conserved C motif, the signature sequence of ABC 
transporters, were present in the nucleotide binding folds. 
Computer assisted analysis of potential transmembrane 
helices of MOAT-C using the TMAP program (19) predicted 12 
transmembrane helices with 6 transmembrane spanning 
helices in each of two membrane spanning domains. This 
6+6 (TM1-TM6 and TM7-TM12) configuration of predicted 
transmembrane helices is in agreement with topological 
models proposed for several other ABC transporters (20, 
21), and is shown in Figure 5. However, alternative 
predictions of transmembrane segments were obtained using 
different program parameters or input sequence alignments. 
Comparison of the hydropathy profiles of MOAT-C with other 
MRP/cMOAT-related transporters (Fig. 6B) indicates that 
its structure is similar to that of MOAT-B, which also has 
two membrane spanning domains. 

In contrast to MOAT-C, hydrophobicity analysis of 
MOAT-D indicated that it has three membrane spanning 
domains. Similar to MRP, cMOAT and the yeast cadmium 
resistance factor 1 (YCF1), MOAT-D has an additional 
N-terminal hydrophobic domain that is not present in 
MOAT-B or MOAT-C (Figs. 5 and 6). A 5+6+6 configuration 
of transmembrane spanning helices has been proposed for 
MRP (38), in which the N-terminal extension harbors 5 
transmembrane spanning helices, and 6 transmembrane 
helices are present in each of the second and third membrane 
spanning domains. An alignment of the MOAT-D predicted 
protein with MRP using the GAP program indicated that the 
proposed MRP transmembrane spanning helices were conserved 
in MOAT-D. This 5+6+6 model for MOAT-D is shown in Fig. 5. 
Another configuration of transmembrane spanning helices 
(5+6+4) was predicted using computer assisted analysis. 
MRP has been reported to have two N-linked glycosylation 
sites in its N-terminus (Asn-19 and Asn-23) and another 
site located between the first and second transmembrane 
spanning helix of its third membrane spanning domain 
(Asn-1006). The alignment of MOAT-D with MRP indicated 
that an N-terminal (Asn-21) and a distal (Asn-1008/1009) 
N-glycosylation site were conserved in analogous 
positions in MOAT-D. Only the distal N-glycosylation site 
of MRP is conserved in MOAT-C (Asn890) (Fig. 5) and MOAT-B 
(Asn746/754). 

Among ABC transporters, the degree of similarity of 
the nucleotide binding folds is considered to be the best 
indicator of functional conservation. Comparison of the 
nucleotide binding folds of MOAT-C and MOAT-D with other 
eukaryotic ABC transporters indicated that they were most 
closely related to those of human MRP, human cMOAT and 
yeast YCF1, three transporters that have organic anions as 
substrates. As shown in Table II, among the human 
transporters, the MOAT-C NBF1 was about equally related to 
MOAT-D, MRP and cMOAT (55-61% identity), and less similar 
to MOAT-B (49% identity). 

Table II. Amino acid identity: nucleotide binding folds 1 
and 2 of MRP/cMOAT sub-family members. 

        MOAT-C     MOAT-D     MOAT-B     MRP        cMOAT      YCF1

                         % identity (NBF1/NBF2)

MOAT-C  -          57.3/58.9  49.3/59.1  60.0/59.4  61.3/60.6  55.3/58.8
MOAT-D  57.3/58.9  -          55.3/54.1  70.7/73.8  67.3/70.0  52.7/61.3
MOAT-B  49.3/59.1  55.3/54.1  -          57.3/61.6  53.3/55.3  56.0/57.2
MRP     60.0/59.4  70.7/73.7  57.3/61.6  -          66.0/73.1  53.3/63.8
cMOAT   61.3/60.6  67.3/70.0  53.3/55.3  66.0/73.1  -          50.7/61.3
YCF1    55.3/58.8  52.7/61.3  56.0/57.2  53.3/63.8  50.7/61.3  -

The MOAT-C NBF2 shared about equal amino acid identity 
with the five other transporters in this group (59-61% 
identity). Overall, the MOAT-C protein was about equally 
related to the other five transporters in this group, with 
33.1-36.5% identity. Aside from these 
transporters, MOAT-C is most closely related to CFTR, with 
which its NBFs shared 44%/42% identity, and SUR, with 
which its NBFs shared 49%/51% identity. 

The MOAT-D NBFs were clearly most closely related to 
those of MRP and cMOAT, with which they shared considerable 
amino acid identity (67.3-73.8%). See Table III. Of the 
latter two transporters, the MOAT-D NBFs were slightly more 
related to those of MRP. In contrast, the MOAT-D NBFs 
shared only 55.3-58.9% identity with those of MOAT-C and 
MOAT-B. Overall, MOAT-D was again most closely related to 
MRP (57.3%) and cMOAT (46.9%), but significantly more 
related to MRP. Consistent with the analysis of NBFs, 
MOAT-D was much less related to MOAT-C and MOAT-B, with 
which it shared only 33.1% and 35.3% identity, 
respectively. Alignment of the MOAT-C and MOAT-D nucleotide 
binding folds with those of other eukaryotic transporters 
is shown in Fig. 6. 



Table III. Overall amino acid identity among MRP/cMOAT
sub-family members

          MOAT-C  MOAT-D  MOAT-B  MRP     cMOAT   YCF1
                         % identity
MOAT-C    -       33.1    36.5    35.8    36.2    33.6
MOAT-D    33.1    -       35.3    57.3    46.9    38.1
MOAT-B    36.4    35.3    -       39.4    36.8    38.8
MRP       35.8    57.3    39.4    -       48.4    46.4
cMOAT     36.3    46.9    36.8    48.8    -       38.8
YCF1      33.6    38.1    38.8    40.4    38.8    -
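The ordering of MOAT-D's relatives described in the text can be reproduced from the MOAT-D row of Table III. The following is a minimal Python sketch; the dictionary name and the helper function are illustrative assumptions and are not part of the disclosure.

```python
# Overall % identity of MOAT-D to the other sub-family members,
# copied from the MOAT-D row of Table III.
MOAT_D_IDENTITY = {
    "MOAT-C": 33.1,
    "MOAT-B": 35.3,
    "MRP": 57.3,
    "cMOAT": 46.9,
    "YCF1": 38.1,
}


def rank_relatives(identities):
    """Return (name, % identity) pairs sorted from most to least related."""
    return sorted(identities.items(), key=lambda item: item[1], reverse=True)


ranked = rank_relatives(MOAT_D_IDENTITY)
for name, pct in ranked:
    print(f"{name:7s} {pct:5.1f}")
```

Sorting by identity places MRP first and cMOAT second, in line with the statement that MOAT-D is most closely related to MRP and then to cMOAT.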





Expression Pattern of MOAT-C and MOAT-D in Human Tissues. 

To gain insight into the possible functions of MOAT-C
and MOAT-D, their expression patterns in a variety of human
tissues were examined by RNA blot analysis. As
shown in Fig. 7 (upper panels), a MOAT-C transcript of
approximately 6.6 kb was readily detected in several
tissues. MOAT-C transcript levels were highest in
skeletal muscle, with intermediate levels in kidney,
testes, heart and brain. Low levels were detected in most
other tissues, including spleen, thymus, prostate, ovary,
and placenta. Prolonged exposures were required for
detection in lung and liver. MOAT-D was expressed as an
approximately 6 kb transcript (middle panels). Compared
to MOAT-C, the MOAT-D expression pattern was more
restricted. MOAT-D was highly expressed in colon and
pancreas, with lower levels in liver and kidney. Low
levels were detected in small intestine, placenta and
prostate. Prolonged exposures were required to detect
MOAT-D in testes, thymus, spleen and lung.

Chromosomal localization of the MOAT-C and MOAT-D genes. 

The MOAT-C and MOAT-D chromosomal localizations 
were determined by fluorescence in situ hybridization. As 
shown in Figure 8, hybridization of the MOAT-C probe to 
metaphase spreads revealed specific labeling at human 
chromosome band 3q27. Fluorescent signals were detected 
on chromosome 3q in each of 22 metaphase spreads scored. 
Of 75 signals observed, 43 (57%) were on 3q. Paired (on 
sister chromatids) signals were only seen at band 3q27. 
Hybridization of the MOAT-D probe revealed specific 
labeling at human chromosome band 17q21.3. Fluorescent 
signals were detected on chromosome 17 in each of 21 
metaphase spreads scored. Of 83 signals observed, 34 
(41%) were on 17q21.3. Paired (on sister chromatids) 
signals were only seen at band 17q21.3. 
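The percentages quoted for the hybridization signals follow directly from the signal counts. As a minimal sketch in Python (the function name is an illustrative assumption, not part of the disclosure), the arithmetic is:

```python
def percent_on_target(on_target, total):
    """Percentage of scored hybridization signals that fall on the target region."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * on_target / total


# Signal counts as reported in the text.
moat_c = percent_on_target(43, 75)  # MOAT-C probe: signals on 3q
moat_d = percent_on_target(34, 83)  # MOAT-D probe: signals on 17q21.3

print(f"MOAT-C: {moat_c:.1f}%  MOAT-D: {moat_d:.1f}%")
```

Rounded to whole numbers, these values agree with the 57% and 41% given above.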
EXAMPLE III
Isolation of MOAT-E and MOAT-E cDNA.

Analysis of ara, a reported cDNA sequence that 
encodes a 453 amino acid transporter, revealed that it is 
a non-physiological sequence representing a combination 
of 5' MRP sequences fused to an MRP/cMOAT-related 
transporter. The MRP sequences extend to codon 8 of the 
reported predicted protein. 

To isolate the complete physiological cDNA, an RT/PCR
approach was employed in which primers were designed
based upon a reported genomic sequence that encodes exons
identical to the reported ara sequence. The MOAT-E cDNA
was isolated in three segments. The first segment,
spanning residues 1-616, was isolated by PCR using 5'
primer ATGGCCGCGCCTGCTGAGC (SEQ ID NO: 10) and 3' primer
GTCTACGACACCAGGGTCAA (SEQ ID NO: 11). The second
segment, spanning residues 1815-3187, was isolated by PCR
using 5' primer CTGCCTGGAAGAAGTTGACC (SEQ ID NO: 12) and 3'
primer CTGGAATGTCCACGTCAACC (SEQ ID NO: 13). The third
segment, spanning residues 3158-1503, was isolated by PCR
using 5' primer GGAGACAGACACGGTTGACG (SEQ ID NO: 14) and
3' primer GCAGACCAGGCCTGACTCC (SEQ ID NO: 15). The
primers were designed based upon the nucleotide sequence
of human genomic BAC clone CIT987SD-962B4. The template
for these reactions was random-primed human kidney cDNA
prepared from total RNA. Using this approach, the
physiological cDNA was isolated; it is designated
MOAT-E herein and set forth as Sequence I.D. No. 7.
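Basic properties of the six primers listed above (length, GC content, and the reverse complement, which gives the sense-strand sequence a 3' primer anneals to) can be tabulated with a short script. This is a minimal Python sketch; the dictionary keys and helper names are illustrative assumptions, and only the primer sequences themselves are taken from the text.

```python
# Primer sequences as given in the text (SEQ ID NOS: 10-15).
PRIMERS = {
    "SEQ ID NO: 10": "ATGGCCGCGCCTGCTGAGC",
    "SEQ ID NO: 11": "GTCTACGACACCAGGGTCAA",
    "SEQ ID NO: 12": "CTGCCTGGAAGAAGTTGACC",
    "SEQ ID NO: 13": "CTGGAATGTCCACGTCAACC",
    "SEQ ID NO: 14": "GGAGACAGACACGGTTGACG",
    "SEQ ID NO: 15": "GCAGACCAGGCCTGACTCC",
}

# Translation table mapping each base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases in seq that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, GC {gc_fraction(seq):.2f}, "
          f"reverse complement {reverse_complement(seq)}")
```

The same helpers can be applied to any other oligonucleotide written 5' to 3' in upper-case A, C, G and T.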

Analysis of the MOAT-E Predicted Protein. 

MOAT-E encodes a 1503 amino acid transporter. The
MOAT-E predicted amino acid sequence is designated
Sequence I.D. No. 8. See Figure 9. Also shown are the
locations of potential transmembrane helices (overbars),
a potential N-glycosylation site (black dot) and the two
nucleotide binding folds (NBF1 and NBF2). Walker A and B
motifs, as well as the signature C motif of ABC
transporters, are also indicated. Comparison of MOAT-E
with ara indicates that the ara predicted protein is not
only a fused sequence, but also that it represents only
446 (~30%) of the 1503 MOAT-E residues.

Comparison of MOAT-E with the other members of the
MRP/cMOAT subfamily, which include MRP, cMOAT, MOAT-B,
MOAT-C and MOAT-D, is shown in Table IV. MOAT-E is
highly related to MOAT-D, MRP and cMOAT, with which it
shares 39-45% identity. This high degree of identity is
also indicated by the high percent identities of the
nucleotide binding folds, which range from 55-61%. In
contrast, MOAT-E is less related to MOAT-B and MOAT-C,
with which it shares ~34% and ~31% identity, respectively.

Table IV. Amino acid identity among MRP/cMOAT sub-family
members.(a) The bold type indicates the percent identity of
the overall proteins, and the parentheses indicate the
percent identity of the nucleotide binding folds.

          MOAT-E       MOAT-B       MOAT-C       MOAT-D       MRP          cMOAT
                                    % identity (b)
MOAT-E    -            33.9         30.6         43.6         45.1         38.9
                       (52.0/56.6)  (50.0/52.5)  (59.3/59.4)  (61.3/61.4)  (55.3/59.4)
MOAT-B    33.9         -            36.4         35.3         39.4         36.8
          (52.0/56.6)               (49.3/59.1)  (55.3/54.1)  (57.3/61.6)  (56.0/57.2)
MOAT-C    30.0         36.4         -            33.1         35.6         36.2
          (50.0/52.5)  (49.3/59.1)               (57.3/58.9)  (60.6/59.4)  (61.3/60.6)
MOAT-D    43.6         35.3         33.1         -            57.3         46.9
          (59.3/59.4)  (55.3/54.1)  (57.3/58.9)               (70.7/73.8)  (67.3/70.0)
MRP       45.1         39.4         35.8         57.3         -            48.4
          (61.3/61.9)  (57.3/61.6)  (60.0/59.4)  (70.7/73.8)               (66.0/73.1)
cMOAT     38.9         36.8         36.2         46.9         48.4         -
          (53.1/59.4)  (56.0/57.2)  (61.3/60.6)  (67.3/70.0)  (66.0/73.1)

(a) Overall amino acid identities are indicated in bold-face, and identities of
nucleotide binding folds 1 and 2 are indicated in parentheses (NBF1/NBF2).
(b) Percent identity was obtained using the GAP command in the GCG package.
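For readers unfamiliar with the measure, percent identity between two already-aligned sequences can be sketched as below. This is a minimal Python illustration under stated assumptions: it counts only columns in which neither sequence has a gap, which is one common convention and is not claimed to reproduce the exact scoring of the GAP program; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, for illustration only).
a = "GRTGSGKSSL"
b = "GRTGAGKS-L"
print(round(percent_identity(a, b), 1))
```

In this toy case nine columns are compared and eight match; a real comparison would first require an alignment such as the one GAP produces.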
Comparison of the hydropathy profile of MOAT-E with
other members of the MRP/cMOAT subfamily is shown in
Figure 10. The data reveal that MOAT-E has a hydrophobic
N-terminal segment that is present in its closest
relatives, MOAT-D, MRP and cMOAT. This structural
feature is present in all of the currently known organic
anion transporters, and suggests that MOAT-E may share
substrate specificity with MRP and cMOAT. MOAT-E may
also share the drug resistance activity of the latter two
proteins. In contrast, MOAT-B and MOAT-C do not have
this hydrophobic N-terminal extension.

Expression Pattern of MOAT-E in Human Tissues. 

In a Northern blot of RNA isolated from various
tissues, MOAT-E expression is restricted to liver and
kidney, suggesting that MOAT-E may participate in the
excretion of substances into the urine and bile. See
Figure 11. This figure also shows that MOAT-E is
expressed as an ~6 kb transcript. This is in contrast to
the ~2.3 kb transcript that was reported for ara, clearly
indicating that the fused ara transcript is unique to the
cell line from which it was isolated, and is not a
physiological transcript. Together, the isolation of
MOAT-E and analysis of its sequence and expression
pattern suggest that it may be involved in cellular
resistance to drugs and/or the excretion of drugs into
the urine and bile.

DISCUSSION 

The present invention discloses additional
MRP/cMOAT-related transporters which were identified by
using a degenerate PCR cloning approach in which the
conserved amino terminal ATP-binding domain of known
eukaryotic transporters was targeted. Using this
approach, the complete coding sequences of MOAT-B, MOAT-C,
MOAT-D and MOAT-E were obtained. MOAT-B is a protein
whose predicted structure indicates that it is a member
of the ABC transporter family. Comparison of the MOAT-B
predicted protein with other transporters reveals that it
is most closely related to MRP, cMOAT and yeast YCF1, and
thus extends the number of known full length MRP-related
transporters. The similarity of MOAT-B to these
transporters suggests that it shares a similar substrate
specificity. Transport assays using membrane vesicle
preparations indicate that MRP is capable of transporting
diverse organic anions, including glutathione
S-conjugates such as LTC4, oxidized glutathione, and
glucuronidated and sulfated conjugates of steroid
hormones and bile salts (7). Although membrane vesicle
transport assays of substrate specificity using
cMOAT-transfected cells have not yet been reported,
genetic and biochemical studies using TR- and EHBR rat
strains, which are defective in the hepatobiliary
excretion of glutathione and glucuronate conjugates,
indicate that it is also an ATP-dependent transporter of
organic anions. cMOAT, which is primarily expressed in
the canalicular membrane of hepatocytes, has been
reported to be absent in these rat strains, and
hepatocyte canalicular membranes prepared from the mutant
rats are deficient in the ATP-dependent transport of
glutathione and glucuronate conjugates (23, 24). In
addition, cMOAT protein has also been reported to be
absent in the hepatocytes of patients with Dubin-Johnson
syndrome (25), a disorder manifested by chronic



WO 99/49735 PCT/US99/06644 

conjugated hyperbilirubinemia. YCF1, a yeast
transporter, has also been demonstrated to transport
glutathione complexes (26). Thus, based upon the
similarity of MOAT-B to these three transporters, it is
possible that it also functions to transport organic
anions, an activity critical to the cellular
detoxification of a wide range of xenobiotics.

MOAT-C, MOAT-D and MOAT-E are three other
MRP/cMOAT-related transporters. The isolation of these
three transporters extends the number of known full length
members of this subfamily to six. Based upon the degree
of amino acid similarity and overall topology, these six
proteins fall into two groups. The first group is
composed of MOAT-D, MOAT-E, MRP and cMOAT. These four
transporters are highly related, sharing ~39-45% amino
acid identity. MOAT-D is more closely related to MRP
(57% identity) than is cMOAT (48% identity), and is
therefore the closest known relative of MRP. In addition
to a high degree of amino acid identity, the similarity
between MOAT-D, MRP and cMOAT also extends to overall
topology. Like MRP and cMOAT, MOAT-D and MOAT-E have
three membrane spanning domains, including an N-terminal
hydrophobic extension that is predicted to harbor ~5
transmembrane helices, and which is absent in
transporters such as CFTR and MDR1. This N-terminal
extension is also present in YCF1, a related yeast
transporter that transports glutathione S-conjugates, and
SUR, a more distantly related transporter involved in the
regulation of potassium channels. The second group of
MRP/cMOAT-related transporters is composed of MOAT-B and
MOAT-C. These two transporters are distinguished from the
first group by their lower level of amino acid similarity
and distinct topology. Like MOAT-D and MOAT-E, MOAT-B
and MOAT-C are more closely related to MRP (39% and 36%,
respectively) and cMOAT (37% and 36%, respectively) than
to other eukaryotic transporters. However, they share
considerably less similarity with MRP, cMOAT, MOAT-D and
MOAT-E than the latter four transporters share with each
other (~39-45% identity). In addition, in contrast to
MRP, cMOAT, MOAT-D and MOAT-E, MOAT-B and MOAT-C do not
have an N-terminal membrane spanning domain, and their
topology is therefore more similar to many other
eukaryotic ABC transporters that also have only two
membrane spanning domains.

Defining the contributions of MOAT-B, MOAT-C, MOAT-D
and MOAT-E to cytotoxic drug resistance will facilitate
the design of novel chemotherapeutic agents. The
multidrug resistance activity of MRP is well described.
While the drug sensitivity pattern of cMOAT-transfected
cells has not yet been reported, the possibility that it
may also confer resistance to cytotoxic drugs is
suggested by a recent report in which transfection of a
cMOAT antisense vector was found to enhance the
sensitivity of a human liver cancer cell line to both
natural product drugs and cisplatin. Since MOAT-D and
MOAT-E are more closely related to MRP than is cMOAT, the
possibility that they will also confer resistance is
particularly intriguing. The availability of the MOAT-B,
MOAT-C, MOAT-D and MOAT-E cDNAs will facilitate the
analysis of their possible contributions to cytotoxic
resistance.
While certain of the preferred embodiments of the
present invention have been described and specifically 
exemplified above, it is not intended that the invention 
be limited to such embodiments. Various modifications 
may be made thereto without departing from the scope and 
spirit of the present invention, as set forth in the 
following claims. 
What is claimed is:

1. An isolated nucleic acid molecule having the
sequence of SEQ ID NO: 1, said nucleic acid molecule
comprising a nucleotide sequence encoding a MOAT-B
transporter protein about 1350 amino acids in length,
said encoded transporter protein comprising a
multi-domain structure including a tandem repeat of
nucleotide binding folds appended C-terminal to a
hydrophobic domain, said nucleotide binding folds having
Walker A and B ATP binding sites, said C-terminal domain
having a plurality of membrane spanning helices.

2. The nucleic acid molecule of claim 1, which
is DNA.

3. The DNA molecule of claim 2, which is a
cDNA comprising a sequence approximately 5.9 kilobase
pairs in length that encodes said MOAT-B transporter
protein.

4. The DNA molecule of claim 2, which is a 
gene comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 1, and said exons encoding said MOAT-B transporter 
protein. 

5. An isolated RNA molecule transcribed from
the nucleic acid of claim 1.

6. The nucleic acid molecule of claim 1,
wherein said sequence encodes a MOAT-B transporter
protein having an amino acid sequence selected from the
group consisting of SEQ ID NO 2 and amino acid sequences
encoded by natural allelic variants of said sequence.

7. The nucleic acid molecule of claim 6, which 
comprises SEQ ID NO 1. 

8. An antibody immunologically specific for 
the protein encoded by the nucleic acid of claim 1. 

9. An antibody as claimed in claim 8, said
antibody being monoclonal.

10. An antibody as claimed in claim 8, said
antibody being polyclonal.

11. An isolated nucleic acid molecule having
the sequence of SEQ ID NO: 3, said nucleic acid molecule
comprising a sequence encoding a MOAT-C transporter
protein about 1450 amino acids in length, said
transporter protein having a multi-domain structure
including a tandem repeat of nucleotide binding folds,
said nucleotide binding folds having Walker A and B
binding sites, and a C-terminal hydrophobic domain that
contains several membrane spanning helices.

12. The nucleic acid molecule of claim 11, which is
DNA.

13. The DNA molecule of claim 12, which is a cDNA
comprising a sequence approximately 6.6 kilobase pairs in
length that encodes said MOAT-C transporter protein.

14. The DNA molecule of claim 12, which is a gene
comprising introns and exons, the exons of said gene
specifically hybridizing with the nucleic acid of SEQ ID
NO 3, and said exons encoding said MOAT-C transporter
protein.

15. An isolated RNA molecule transcribed from the 
nucleic acid of claim 11. 



16. The nucleic acid molecule of claim 11, wherein 
said sequence encodes a MOAT-C transporter protein having 
an amino acid sequence selected from the group consisting 
of SEQ ID NO 4 and amino acid sequences encoded by 
natural allelic variants of said sequence. 

17. The nucleic acid molecule of claim 11, which 
comprises SEQ ID NO 3. 

18. An antibody immunologically specific for the 
protein encoded by the nucleic acid of claim 11. 

19. An antibody as claimed in claim 18, said 
antibody being monoclonal. 

20. An antibody as claimed in claim 18, said 
antibody being polyclonal. 

21. An oligonucleotide between about 10 and about
200 nucleotides in length, which specifically hybridizes
with a protein translation initiation site in a
nucleotide sequence encoding amino acids of SEQ ID NO 4.

22. An oligonucleotide between about 10 and about
200 nucleotides in length, which specifically hybridizes
with a protein translation initiation site in a
nucleotide sequence encoding amino acids of SEQ ID NO 2.

23. An isolated nucleic acid molecule having the
sequence of SEQ ID NO: 5, said nucleic acid molecule
comprising a sequence encoding a MOAT-D transporter
protein about 1550 amino acids in length, said
transporter protein having a multi-domain structure
including a tandem repeat of nucleotide binding folds,
said nucleotide binding folds having Walker A and B
binding sites, and a C-terminal hydrophobic domain that
contains several membrane spanning helices.

24. The nucleic acid molecule of claim 23, which is
DNA.

25. The DNA molecule of claim 24, which is a cDNA 
comprising a sequence approximately 6 kilobase pairs in 
length that encodes said MOAT-D transporter protein. 

26. The DNA molecule of claim 24, which is a gene 
comprising introns and exons, the exons of said gene 
specifically hybridizing with the nucleic acid of SEQ ID 
NO 5, and said exons encoding said MOAT-D transporter 
protein. 

27. An isolated RNA molecule transcribed from the
nucleic acid of claim 23.

28. The nucleic acid molecule of claim 23, wherein
said sequence encodes a MOAT-D transporter protein having
an amino acid sequence selected from the group consisting
of SEQ ID NO 6 and amino acid sequences encoded by
natural allelic variants of said sequence.

29. The nucleic acid molecule of claim 23, which 
comprises SEQ ID NO 5. 

30. An antibody immunologically specific for the 
protein encoded by the nucleic acid of claim 23. 

31. An antibody as claimed in claim 30, said
antibody being monoclonal.

32. An antibody as claimed in claim 30, said
antibody being polyclonal.

33. An oligonucleotide between about 10 and about
200 nucleotides in length, which specifically hybridizes
with a protein translation initiation site in a
nucleotide sequence encoding amino acids of SEQ ID NO 6.

34. An isolated nucleic acid molecule having the
sequence of SEQ ID NO: 7, said nucleic acid molecule
comprising a nucleotide sequence encoding a MOAT-E
transporter protein about 1503 amino acids in length,
said transporter protein having a multi-domain structure
including a tandem repeat of nucleotide binding folds,
said nucleotide binding folds having Walker A and B
binding sites, and a C-terminal hydrophobic domain that
contains several membrane spanning helices.

35. The nucleic acid molecule of claim 34,
which is DNA.

36. The DNA molecule of claim 35, which is a
cDNA comprising a sequence approximately 6 kilobase pairs
in length that encodes said MOAT-E transporter protein.

37. The DNA molecule of claim 35, which is a
gene comprising introns and exons, the exons of said gene
specifically hybridizing with the nucleic acid of SEQ ID
NO 7, and said exons encoding said MOAT-E transporter
protein.

38. An isolated RNA molecule transcribed from 
the nucleic acid of claim 34. 

39. The nucleic acid molecule of claim 34, 
wherein said sequence encodes a MOAT-E transporter 
protein having an amino acid sequence selected from the 
group consisting of SEQ ID NO 8 and amino acid sequences 
encoded by natural allelic variants of said sequence. 

40. The nucleic acid molecule of claim 39,
which comprises SEQ ID NO 7.

41. An antibody immunologically specific for
the protein encoded by the nucleic acid of claim 34.

42. An antibody as claimed in claim 41, said
antibody being monoclonal.

43. An antibody as claimed in claim 41, said
antibody being polyclonal.

44. An oligonucleotide between about 10 and
about 200 nucleotides in length, which specifically
hybridizes with a protein translation initiation site in
a nucleotide sequence encoding amino acids of SEQ ID NO
7.

45. A plasmid comprising a nucleotide sequence
selected from the group consisting of SEQ ID NO: 1, SEQ
ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7.

46. A vector comprising a nucleotide sequence
selected from the group consisting of SEQ ID NO: 1, SEQ
ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7.

47. A retroviral vector comprising a
nucleotide sequence selected from the group consisting of
SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5 and SEQ ID NO: 7.

48. A host cell comprising at least one
nucleic acid molecule having a sequence selected from the
group consisting of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID
NO: 5 and SEQ ID NO: 7.

49. A host cell as claimed in claim 48,
wherein said host cell is selected from the group
consisting of bacterial, fungal, mammalian, insect and
plant cells.

50. A host cell as claimed in claim 48, 
wherein said nucleic acid is provided in a plasmid and is 
operably linked to mammalian regulatory elements which 
confer high expression and stability of mRNA transcribed 
from said nucleic acid. 
51. A host cell as claimed in claim 48,
wherein said nucleic acid is provided in a plasmid and is
operably linked to mammalian regulatory control elements
in reverse anti-sense orientation.

52. A host animal comprising at least one
nucleic acid molecule selected from the group consisting
of SEQ ID NO: 1, SEQ ID NO: 3, SEQ ID NO: 5 and SEQ ID
NO: 7.

53. A host animal as claimed in claim 52, 
wherein said animal harbors a homozygous null mutation in 
its endogenous MOAT gene wherein said mutation has been 
introduced into said mouse or an ancestor of said mouse 
via homologous recombination in embryonic stem cells, and 
further wherein said mouse does not express a functional 
mouse MOAT protein. 

54. The transgenic mouse of claim 53, wherein 
said mouse is fertile and transmits said null mutation to 
its offspring. 

55. The transgenic mouse of claim 53, wherein
said null mutation has been introduced into an ancestor
of said mouse at an embryonic stage following
microinjection of embryonic stem cells into a mouse
blastocyst.

56. A method for screening a test compound for
inhibition of MOAT mediated transport, comprising:

a) providing a host cell expressing at least
one MOAT-encoding nucleic acid having a sequence selected
from the group consisting of SEQ ID NOS: 1, 3, 5, and 7;




b) contacting said host cell with a compound 
suspected of inhibiting MOAT-mediated transporter 
activity; and 

c) assessing inhibition of transport mediated 
by said compound. 

57. A method as claimed in claim 56, wherein 
inhibition of MOAT mediated transport is indicated by 
restoration of anticancer drug sensitivity. 

58. A method as claimed in claim 57, wherein
said inhibition of MOAT mediated transport is indicated
by a reduction of transporter mediated cellular efflux of
anticancer agents.



59. A kit for detecting the presence of MOAT 
encoding nucleic acids in a sample, comprising: 

a) oligonucleotide primers specific for 
amplification of MOAT encoding nucleic acids; 

b) polymerase enzyme; 

c) amplification buffer; and 

d) MOAT specific DNA for use as a 
positive control. 
841 VPWSYYGVYI QAAGGPLAFL VIMALFKLNV GSTAPSTWWL SYHIKQGSGH TTVTRGNETS

TM8 

901 VSDSKKDNPB MQYTASIZAL SMAVMLIU* IRGWFVKGT iWsRLHDE LFRRILRSPM 

_ T*M9 

961 «-FDTTPTG R 1LNRFSKDMD EVDVRLPFQ& EMFIQNVILV FFCVGKLAGV FFWFLVAVGP 

1021 L.VTLFSYLHI VSRVLIRELK W£KITQSPP LSHITSSIQG LATIHAYNKG QEFLHRTQEL 

1081 LDDHQAPFFL FTCAMRWIAV RLDLISIALI TTTGLKIVLM HGQIPPAT AG LAISTAVQLT 

1141 GLPQFTVRIA SETEARFTSV ERINHYT JCTL SLEAPARIKN KAPSPDWPQE GEVTFEHAEM 
f*"NBF2 

1201 RYREHLPLVT. KKVSFT1KPK EKIGITV GRTG SGKSS LGMAI, FRXVELSGGC IKIDGVRISD 

1261 IGLADLRSKL SIIPQEPVI^ SGTVRSNLOP FNQYTEDQIW DALERTHMKE CIAQLPLKLE 

1321 SEVMENGDNF SVGJ* QLLC I ARALLRHC P LILD EATAAM DtS E TIREAFADCT 

c B 
1381 KLTIAHRLHT VLGSDRIKVI. AQGQWEFDT PSVLLSNDSS RFYAMFAAAE H1CVAVKG 
Fig. 5B



1 MGPMDALCGS GELGSKFWDS NLSVHTENPD LTPCF QNSLL AWVPCIYLWV ALPCYLL YLR 
HHCRGYULS HLSKLKHVLG VLLWCVSWAD LFYSFHGLVH GRAP APVFFV T P L WGVTHL 
LATLLIQYER LQGV QSM.VL IIFWFLCWC AIVPFRSKIL LAKAEGEISD PFrF^fTFh" 
181 FALVLSAI.IL ACFREKPPFF sakkvdpnpy PETSVGFLSR LFFWWF"tKMA iygyrhplee 
KDLWSLK EED RSQMVV< ^ L «WRKQ TO ARHKASAAPG KNASGEDEVL LGARPRPRKP 
301 SFMa »* W ^ SKLISACF KLIQDLLSFI NPQLLSILIR FISNPMAPSW WG FLVAG LMF 
3d lcsmm qsi.il qhyteyifvt gvkfrtg img VIYRKALVIT HSVKRASTVG EXVHLHSVDA 

«1 0»™>lWl MLLWSAPLQI ILAIYFllwQK LGPb'VLAGVA FMVTLLI PLNG AVA VKMRAFQ 
481 VKQMKLKDSR^I KLM SE I LNG IKVLKLYAKE PSFLKQVEGI RQGELQLLRT AAYLHTTTTF 
54! TWMCSPFLVT LITLWVYVYV DPNNVLDAEK AFVSVSLFNI LrLplhMLPQ L ISMLTQASV 
601 SLKRIQQFLS QEELDPQSVE RKTISPGYAI TIHSGTFTWA ODLPPtS D IQ VPKGAI.V 
661 AWGPVG^SSLVSALLGE MEKLEGKVHM KGSVAYYPQQ AWIQNCTLQE NVLFGKALNP 
721 KRYQQTLEAC^ADLEMLP GGDQTEIGEK GINLSGGQRQ RVS LARAVYS DADIFLLDDP 
781 LSAVDSHVAK HIFDEVIGPE GVLAGKTRVL VTHGISFXPQ TDFIIVLADG QVSEMGPYPA 
841 LLQRNGSFAK FLCHYAPDED QGHLEDSWTA LEGAEDKEAL LIEDTLSNHT DLTDNDPVTY 
901 SAI^SDGEGQ GRPVPRRHLG PSEKVQVTEA KADGAXTQEE KAAIGTVEXS 

9 " ^A^LcrTIAICL L YVGQSAAAIG AKVWLSAWTN DAKADSRQKH TSLRLGVHA 
102! ^ILQGFLVH LAAMAMAAGC IQAARVLHQ A LLHHKIRSPQ SFFDTTPSGR ILNCFSKDIY 
1081 VVDEVIAPV . i^SFFKA ISTLWXMAS TP LFTVVILP ^LlVQR FYAA TSRQLK 
H41 PXESVSRSP^YSHFSETVTG ASVIRAYNRS RDFETISDTK VDAHQRSCYP TIISHRWLSI 
"CI O-VEFVGNCW LFAALFAVTG RSSLNPGLVG LSVSY^VT FALNWMIRMM SDLESKXVAV 
1261 ERVKEYSKTE TEAPWWEGS RP PEGWPPRG EVEFRHISVR TRPGLDL^R DLSLHVHGGE 
1321 KVGI VGRTGA^ GKS SKTLCLF RILEAAKGEI RIDGLNVADI GLHDLRSQLT IIPQDPILFS 
1361 GTLRKNLDPF GSYSEEDIWW ALELSHLHTF VSSQPAGLDF OCSEGGENLS VGORQLVCLA 
1441 RAXJ.RKS R IL_VLD EAT AA I D LETDNLIQAT IRTQFDTCTV LTIAHRLNTI MDYTRVLVLD 
1501 KGWAEFDSP AKLIAARGIF YG HARD AG LA 
Nucleotide Binding Fold I
1 MAAPAEPCAG QGVWNQTEPE PAATSLLSLC FLRT AGV>WVP PMVLWvLGPI YLtFIH HHGR

61 GYLRMSPLFK AKMVLGFALI VLCTSSVAVA LWKIQQGTPE APEFLIHPTV WLTTMSFAVF 

121 LI HTERKKGV OSSGVLFGYW LLCFVLPATN AAOOASGAGF QSDPVRHLST YLCLSLWAQ 

181 FVLSCLADQP PFFPEDPOQS NPCPETGAAF PSKATFWWVS GLVWRGYRRP LRPKDLWSLG 

241 RENSSEELVS RLEKEWMRNR SAARRHNKAI AFKRKGGSGM KAPETEPFLR QEGSQWRPLL 

301 KAIWOVFHST FLLGTLSLII SDVFRFTVPK LLSLFLEF1G DPKPPAWKGY LLAVLMFLSA 



361 CLQTLFEQQN MYRLKVPQMR LRSAITGLVY RKVLALSSGS RKASAVGDW NLVSVDVQRL 
4 21 TESVLYLNGL WLPLVWIWC FVYLWQLLGP S ALTAI AVFL SLLPLNFFIS KKRNHHQEEQ 
481 MRQKDSRARL TSSILRNSKT IKFHGWEGAF LDRVLGIRGQ ELGALRTSGL LFSVSLVSFG 



541 VSTFLVALW FAVHTLVAEN AMNAEKAFVT LTVLNILNKA QAFLPFSIHS LVQARVSFDR 
601 LVTFLCLEEV DPGWDSSSS GSAAGKDCIT IHSATFAWSQ ESPPCLHRIN LTVPQGCLLA 
661 WGPVGAGKS SLLSALLGEL SKVEGFVSIE GAVAYVPQEA WVQNTSWEK VCFGQELDPP 

A 

721 WLERVLEACA LQPDVDSFPE GIHTSIGEQG M KLSGGQ KQR LSLARAVYRK AAVYLLDDPL 

NBF1-*-j C B 

781 AALDAHVGQH VFNQVIGPGG LLQGTTRILV THALHILPQA DWIIVLANGA IAEMGSYQEL 

841 LQRKGALVCL LDQARQPGDR GEGETEPGTS TKDPRGTSAG RRPELRRERS IKSVPEKDRT 

901 TSEAQTEVPL DDPDRAGWPA GKDSIQYGRV KATVHLAYLR AVGTPLCLYA LPLFLCQOVA 

961 SFCRGYWLSL WADDPAVGGQ QTQAALRGGI FGLLGCLQAI GLFASMAAVL LGGARASRLL 

1021 FQRLLWDWR SPISFFBRTP IGHLLNRFSK ETDTVDVDIP DKLRSLLMYA FGLLEVSLW 



1081 AVATPIATVA ILPLFLLYAG FQSLYWSSC QLRRLESASY SSVCSHHAET FOGSTWRAF 
1141 RTQAPFVAQN NARVDESQRI SFPRLVADRW LAANVELLGN GLVFAAATCA VLSKAHLSTiG 



1201 LVGFSVSAAI, QVTQALQWW RNWTDLENSI VSVERMQDYA WTPKEAPWRL PTCAAQPPWP 

C-NBF2 

1261 QGGQI EFRDF GLRYRPELPL AVQGVSLKI H AGEKVGIV GR TGAGKS SLAS GLLRLQEAAE 

A 

1321 GGIWIDGVPI AHVGLHTLRS RISIIFQDPI LFPGSLRMNL DLLQEHSDEA IWAALETVQL 

NBF2-* — 

1381 KALVASLPGQ ZjQY KCAO RG E DL SVGQKQ LL CLARALLRKT Q ILILD EATA AVDPGTELQM 

C B 

1441 QAMLGSWFAQ CTVLLIAHRL RSVMDCARVL VMDKGQVAES GSPAQLLAQK GLFYRLAQES 
1501 GLV 
MOAT-B cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY



ATGCTGCCCGTGTACCAGGAGGTGAAGCCCAACCCGCTGCAGGACGCGAACATCTGCTCA 

1 + - — + + + + — - + 60 

TACGACGGGCACATGGTCCTCCACTTCGGGTTGGGCGACGTCCTGCGCTTGTAGACGAGT 

a MLPVYQEVKPNPLQDANICS -

CGCGTGTTCTTCTGGTGGCTCAATCCCTTGTTTAAAATTGGCCATAAACGGAGATTAGAG 

6 1 + + + + + + 120 

G CG C AC AAG AAG ACCACCG AGTT AG G G A AC A A ATTTT A ACCG GTATTTG CCTCTAATCTC 

a RVFFWWLNPLFKIGHKRRLE - 

GAAGATGATATGTATTCAGTGCTGCCAGAAGACCGCTCACAGCACCTTGGAGAGGAGTTG 

121 + + + + + 4- 180 

CTTCTACT ATACATAAGTC ACG ACGGTCTTCTG GCG AGTGTCGTG G AACCTCTCCTC AAC 

a EDDMYSVLPEDRSQHLGEEL- 

C AAG G G TTCTG G G AT AAAG AAG TTTT A AG AG CTG AG A ATG ACG C A C AG A AG CCTTCTTTA 

181 + + + + + + 240 

GTTCCCAAGACCCTATTTCTTCAAAATTCTCGACTCTTACTGCGTGTCTTCGGAAGAAAT 

a QGFWDKEVLRAENDAQKPSL - 

ACAAG AG C AATC ATAAAGTGTTACTG G AAATCTT ATTTAGTTTTG G G AA I I I I I ACGTTA 

241 + + + + + + 300 

TGTTCTCGTTAGTATTTCACAATGACCTTTAGAATAAATCAAAACCCTTAAAAATGCAAT 

a TRAIIKCYWKSYLVLGIFTL- 

ATTG AG G AAAG TGCCAAAGTAATCC AG CCCATATTTTTGGGAAAAATT ATT AATT ATTTT 

301 + 4- 4- + 4- + 360 

TAACTCCI 1 ICACGGTTTCATTAGGTCGGGTATAAAAACCCTTTTTAATAATTAATAAAA 

a 1EESAKVIQPIFLGKIINYF- 

GAAAATTATGATCCCATGGATTCTGTGGCTTTGAACACAGCGTACGCCTATGCCACGGTG 
361 + + + + + + 420

CTTTTAATACTAGGGTACCTAAGACACCGAAACTTGTGTCGCATGCGGATACGGTGCCAC 

ENYDPMDSVALNTAYAYATV - 

CTGACi I I I I GCACGCTCATTTTG G CTATACTGCATCACTTATATTTTTATC ACGTTCAG 
421 + + + + - + + 480 

GACTGAAAAACGTGCGAGTAAAACCGATATGACGTAGTGAATATAAAAATAGTGCAAGTC 
LTFCTLILAILHHLYFYHVQ - 

TGTGCTGGGATGAGGTTACGAGTAGCCATGTGCCATATGATTTATCGGAAGGCACTTCGT 
481 + + + + + + 540 

ACACGACCCTACTCCAATGCTCATCGGTACACGGTATACTAAATAGCCTTCCGTGAAGCA 
CAGMRLRVAMCHM1YRKALR - 

CTTAGTAAC ATG GCC ATG G G G A AG AC AACC AC AG G CCAG ATAGTC AATCTG CTGTCC AAT 
541 + + + + + + 600 

GAATCATTGTACCGGTACCCCTTCTGTTGGTGTCCG6TCTATCAGTTAGACGACAGGTTA 
LSNMAMGKTTTGQIVNLLSN - 

GATGTGAACAAGTTTGATCAGGTGACAGTGTTCTTACACTTCCTGTGGGCAGGACCACTG

601 + + + + + + 660

CTACACTTGTTCAAACTAGTCCACTGTCACAAGAATGTGAAGGACACCCGTCCTGGTGAC

DVNKFDQVTVFLHFLWAGPL - 

CAGGCGATCGCAGTGACTGCCCTACTCTGGATGGAGATAGGAATATCGTGCCTTGCTGGG

661 + + + + + + 720 

GTCCGCTAGCGTCACTGACGGGATGAGACCTACCTCTATCCTTATAGCACGGAACGACCC 

QAIAVTALLWMEIG ISCLAG - 

ATGGCAGTTCTAATCATTCTCCTGCCCTTGCAAAGCTGTTTTGGGAAGTTGTTCTCATCA

721 + + + + + + 780

TACCGTCAAGATTAGTAAGAGGACGGGAACGTTTCGACAAAACCCTTCAACAAGAGTAGT

MAVLIILLPLQSCFGKLFSS - 

CTGAGGAGTAAAACTGCAACTTTCACGGATGCCAGGATCAGGACCATGAATGAAGTTATA 
781 + + + + + + 840
GACTCCTCATTTTGACGTTGAAAGTGCCTACGGTCCTAGTCCTGGTACTTACTTCAATAT
a LRSKTATFTDARIRTMNEVI - 

ACTGGTATAAGGATAATAAAAATGTACGCCTGGGAAAAGTCATTTTCAAATCTTATTACC
841 + + + + + + 900

TGACCATATTCCTATTATTTTTACATGCGGACCCTTTTCAGTAAAAGTTTAGAATAATGG
a TGIRIIKMYAWEKSFSNLIT - 

AATTTGAGAAAGAAGGAGATTTCCAAGATTCTGAGAAGTTCCTGCCTCAGGGGGATGAAT
901 + + + + + + 960

TTAAACTCTTTCTTCCTCTAAAGGTTCTAAGACTCTTCAAGGACGGAGTCCCCCTACTTA 

a NLRKKEISKILRSSCLRGMN - 

TTGGCTTCGTTTTTCAGTGCAAGCAAAATCATCGTGTTTGTGACCTTCACCACCTACGTG
961 + + + + + + 1020

AACCGAAGCAAAAAGTCACGTTCGTTTTAGTAGCACAAACACTGGAAGTGGTGGATGCAC

a LASFFSASKIIVFVTFTTYV -

CTCCTCGGCAGTGTGATCACAGCCAGCCGCGTGTTCGTGGCAGTGACGCTGTATGGGGCT 
1021 + + + + + + 1080 

GAGGAGCCGTCACACTAGTGTCGGTCGGCGCACAAGCACCGTCACTGCGACATACCCCGA
a LLGSVITASRVFVAVTLYGA - 

GTGCGGCTGACGGTTACCCTCTTCTTCCCCTCAGCCATTGAGAGGGTGTCAGAGGCAATC
1081 + + + + + + 1140 

CACGCCGACTGCCAATGGGAGAAGAAGGGGAGTCGGTAACTCTCCCACAGTCTCCGTTAG 
a VRLTVTLFFPSAIERVSEAI - 

GTCAGCATCCGAAGAATCCAGACCTTTTTGCTACTTGATGAGATATCACAGCGCAACCGT 
1141 + + + + + + 1200

CAGTCGTAGGCTTCTTAGGTCTGGAAAAACGATGAACTACTCTATAGTGTCGCGTTGGCA 

a VSIRRIQTFLLLDEISQRNR- 

CAGCTGCCGTCAGATGGTAAAAAGATGGTGCATGTGCAGGATTTTACTGCTTTTTGGGAT 
1201 + + + + + + 1260

GTCGACGGCAGTCTACCATTTTTCTACCACGTACACGTCCTAAAATGACGAAAAACCCTA 
AAGGCATCAGAGACCCCAACTCTACAAGGCCTTTCCTTTACTGTCAGACCTGGCGAATTG

1261 + + + + + + 1320

TTCCGTAGTCTCTGGGGTTGAGATGTTCCGGAAAGGAAATGACAGTCTGGACCGCTTAAC 



a KASETPTLQGLSFTVRPGEL - 

TTAGCTGTGGTCGGCCCCGTGGGAGCAGGGAAGTCATCACTGTTAAGTGCCGTGCTCGGG 
1321 + + + + + + 1380 

AATCGACACCAGCCGGGGCACCCTCGTCCCTTCAGTAGTGACAATTCACGGCACGAGCCC 
a LAVVGPVGAGKSSLLSAVLG - 

GAATTGGCCCCAAGTCACGGGCTGGTCAGCGTGCATGGAAGAATTGCCTATGTGTCTCAG 
1381 + + + + + + 1440

CTTAACCGGGGTTCAGTGCCCGACCAGTCGCACGTACCTTCTTAACGGATACACAGAGTC 
a ELAPSHGLVSVHGRIAYVSQ - 

CAGCCCTGGGTGTTCTCGGGAACTCTGAGGAGTAATATTTTATTTGGGAAGAAATATGAA 
1441 + + + + + + 1500

GTCGGGACCCACAAGAGCCCTTGAGACTCCTCATTATAAAATAAACCCTTCTTTATACTT
a QPWVFSGTLRSNILFGKKYE - 

AAGGAACGATATGAAAAAGTCATAAAGGCTTGTGCTCTGAAAAAGGATTTACAGCTGTTG 
1501 + + + + + + 1560 

TTCCTTGCTATACTTTTTCAGTATTTCCGAACACGAGACTTTTTCCTAAATGTCGACAAC
a KERYEKVIKACALKKDLQLL - 

GAGGATGGTGATCTGACTGTGATAGGAGATCGGGGAACCACGCTGAGTGGAGGGCAGAAA 
1561 + + + + + + 1620 

CTCCTACCACTAGACTGACACTATCCTCTAGCCCCTTGGTGCGACTCACCTCCCGTCTTT 



a EDGDLTVIGDRGTTLSGGQK -



GCACGGGTAAACCTTGCAAGAGCAGTGTATCAAGATGCTGACATCTATCTCCTGGACGAT 
1621 + + + + + + 1680

CGTGCCCATTTGGAACGTTCTCGTCACATAGTTCTACGACTGTAGATAGAGGACCTGCTA 
a ARVNLARAVYQDADIYLLDD -

CCTCTCAGTGCAGTAGATGCGGAAGTTAGCAGACACTTGTTCGAACTGTGTATTTGTCAA
1681 + + + + + + 1740

GGAGAGTCACGTCATCTACGCCTTCAATCGTCTGTGAACAAGCTTGACACATAAACAGTT 
a PLSAVDAEVSRHLFELCICQ - 

ATTTTGCATGAGAAGATCACAATTTTAGTGACTCATCAGTTGCAGTACCTCAAAGCTGCA 
1741 + + + + + + 1800

TAAAACGTACTCTTCTAGTGTTAAAATCACTGAGTAGTCAACGTCATGGAGTTTCGACGT
a ILHEKITILVTHQLQYLKAA - 

AGTCAGATTCTGATATTGAAAGATGGTAAAATGGTGCAGAAGGGGACTTACACTGAGTTC 
1801 + + + + + + 1860

TCAGTCTAAGACTATAACTTTCTACCATTTTACCACGTCTTCCCCTGAATGTGACTCAAG 
a SQILILKDGKMVQKGTYTEF -

CTAAAATCTGGTATAGATTTTGGCTCCCTTTTAAAGAAGGATAATGAGGAAAGTGAACAA

1861 + + + + + + 1920

GATTTTAGACCATATCTAAAACCGAGGGAAAATTTCTTCCTATTACTCCTTTCACTTGTT

a LKSGIDFGSLLKKDNEESEQ - 

CCTCCAGTTCCAGGAACTCCCACACTAAGGAATCGTACCTTCTCAGAGTCTTCGGTTTGG
1921 + + + + + + 1980

GGAGGTCAAGGTCCTTGAGGGTGTGATTCCTTAGCATGGAAGAGTCTCAGAAGCCAAACC
a PPVPGTPTLRNRTFSESSVW - 

TCTCAACAATCTTCTAGACCCTCCTTGAAAGATGGTGCTCTGGAGAGCCAAGATACAGAG 
1981 + + + + + + 2040 

AGAGTTGTTAGAAGATCTGGGAGGAACTTTCTACCACGAGACCTCTCGGTTCTATGTCTC 
a SQQSSRPSLKDGALESQDTE - 

AATGTCCCAGTTACACTATCAGAGGAGAACCGTTCTGAAGGAAAAGTTGGTTTTCAGGCC 
2041 + + + + + + 2100

TTACAGGGTCAATGTGATAGTCTCCTCTTGGCAAGACTTCCTTTTCAACCAAAAGTCCGG 
a NVPVTLSEENRSEGKVGFQA 
TATAAGAATTACTTCAGAGCTGGTGCTCACTGGATTGTCTTCATTTTCCTTATTCTCCTA
2101 + + + + + + 2160

ATATTCTTAATGAAGTCTCGACCACGAGTGACCTAACAGAAGTAAAAGGAATAAGAGGAT
YKNYFRAGAHWIVFIFLILL -

AACACTGCAGCTCAGGTTGCCTATGTGCTTCAAGATTGGTGGCTTTCATACTGGGCAAAC 
2161 + + + + + + 2220 

TTGTGACGTCGAGTCCAACGGATACACGAAGTTCTAACCACCGAAAGTATGACCCGTTTG 
NTAAQVAYVLQDWWLSYWAN - 

AAACAAAGTATGCTAAATGTCACTGTAAATGGAGGAGGAAATGTAACCGAGAAGCTAGAT
2221 + + + + + + 2280

TTTGTTTCATACGATTTACAGTGACATTTACCTCCTCCTTTACATTGGCTCTTCGATCTA
KQSMLNVTVNGGGNVTEKLD -

CTTAACTGGTACTTAGGAATTTATTCAGGTTTAACTGTAGCTACCGTTCTTTTTGGCATA

2281 + + + + + + 2340

GAATTGACCATGAATCCTTAAATAAGTCCAAATTGACATCGATGGCAAGAAAAACCGTAT

LNWYLGIYSGLTVATVLFGI - 

GCAAGATCTCTATTGGTATTCTACGTCCTTGTTAACTCTTCACAAACTTTGCACAACAAA
2341 + + + + + + 2400

CGTTCTAGAGATAACCATAAGATGCAGGAACAATTGAGAAGTGTTTGAAACGTGTTGTTT
ARSLLVFYVLVNSSQTLHNK - 

ATGTTTGAGTCAATTCTGAAAGCTCCGGTATTATTCTTTGATAGAAATCCAATAGGAAGA
2401 + + + + + + 2460 

TACAAACTCAGTTAAGACTTTCGAGGCCATAATAAGAAACTATCTTTAGGTTATCCTTCT 
MFESILKAPVLFFDRNPIGR - 

ATTTTAAATCGTTTCTCCAAAGACATTGGACACTTGGATGATTTGCTGCCGCTGACGTTT
2461 + + + + + + 2520

TAAAATTTAGCAAAGAGGTTTCTGTAACCTGTGAACCTACTAAACGACGGCGACTGCAAA
ILNRFSKDIGHLDDLLPLTF -
TTAGATTTCATCCAGACATTGCTACAAGTGGTTGGTGTGGTCTCTGTGGCTGTGGCCGTG
2521 + + + + + + 2580 

AATCTAAAGTAGGTCTGTAACGATGTTCACCAACCACACCAGAGACACCGACACCGGCAC 



a LDFIQTLLQVVGVVSVAVAV -



ATTCCTTGGATCGCAATACCCTTGGTTCCCCTTGGAATCATTTTCATTTTTCTTCGGCGA 
2581 + + + + + + 2640

TAAGGAACCTAGCGTTATGGGAACCAAGGGGAACCTTAGTAAAAGTAAAAAGAAGCCGCT

IPWIAIPLVPLGIIFIFLRR -

TATTTTTTGGAAACGTCAAGAGATGTGAAGCGCCTGGAATCTACAACTCGGAGTCCAGTG 
2641 + + + + + + 2700

ATAAAAAACCTTTGCAGTTCTCTACACTTCGCGGACCTTAGATGTTGAGCCTCAGGTCAC 
YFLETSRDVKRLESTTRSPV - 

TTTTCCCACTTGTCATCTTCTCTCCAGGGGCTCTGGACCATCCGGGCATACAAAGCAGAA 
2701 + + + + + + 2760

AAAAGGGTGAACAGTAGAAGAGAGGTCCCCGAGACCTGGTAGGCCCGTATGTTTCGTCTT 
FSHLSSSLQGLWTIRAYKAE - 

GAGAGGTGTCAGGAACTGTTTGATGCACACCAGGATTTACATTCAGAGGCTTGGTTCTTG 
2761 + + + + + + 2820 

CTCTCCACAGTCCTTGACAAACTACGTGTGGTCCTAAATGTAAGTCTCCGAACCAAGAAC 
ERCQELFDAHQDLHSEAWFL -

TTTTTGACAACGTCCCGCTGGTTCGCCGTCCGTCTGGATGCCATCTGTGCCATGTTTGTC
2821 + + + + + + 2880

AAAAACTGTTGCAGGGCGACCAAGCGGCAGGCAGACCTACGGTAGACACGGTACAAACAG

FLTTSRWFAVRLDAICAMFV - 

ATCATCGTTGCCTTTGGGTCCCTGATTCTGGCAAAAACTCTGGATGCCGGGCAGGTTGGT 
2881 + + + + + + 2940

TAGTAGCAACGGAAACCCAGGGACTAAGACCGTTTTTGAGACCTACGGCCCGTCCAACCA 
IIVAFGSLILAKTLDAGQVG - 

TTGGCACTGTCCTATGCCCTCACGCTCATGGGGATGTTTCAGTGGTGTGTTCGACAAAGT
2941 + + + + + + 3000

AACCGTGACAGGATACGGGAGTGCGAGTACCCCTACAAAGTCACCACACAAGCTGTTTCA 

a LALSYALTLMGMFQWCVRQS 

GCTGAAGTTGAGAATATGATGATCTCAGTAGAAAGGGTCATTGAATACACAGACCTTGAA
3001 + + + + + + 3060

CGACTTCAACTCTTATACTACTAGAGTCATCTTTCCCAGTAACTTATGTGTCTGGAACTT
a AEVENMMISVERVIEYTDLE - 

AAAGAAGCACCTTGGGAATATCAGAAACGCCCACCACCAGCCTGGCCCCATGAAGGAGTG 
3061 + + + + + + 3120

TTTCTTCGTGGAACCCTTATAGTCTTTGCGGGTGGTGGTCGGACCGGGGTACTTCCTCAC
a KEAPWEYQKRPPPAWPHEGV -



ATAATCTTTGACAATGTGAACTTCATGTACAGTCCAGGTGGGCCTCTGGTACTGAAGCAT 
3121 + + + + + + 3180

TATTAGAAACTGTTACACTTGAAGTACATGTCAGGTCCACCCGGAGACCATGACTTCGTA 



a IIFDNVNFMYSPGGPLVLKH - 

CTGACAGCACTCATTAAATCACAAGAAAAGGTTGGCATTGTGGGAAGAACCGGAGCTGGA 
3181 + + + + + + 3240 

GACTGTCGTGAGTAATTTAGTGTTCTTTTCCAACCGTAACACCCTTCTTGGCCTCGACCT 
a LTALIKSQEKVGIVGRTGAG -

^A^^GTTCCCTCATCTCAGCCCTTTTTAGATTGTCAGAACCCGAAGGTAAAATTTGGATT
3241 + + + + + + 3300

TCAAGGGAGTAGAGTCGGGAAAAATCTAACAGTCTTGGGCTTCCATTTTAAACCTAA



a KSSLISALFRLSEPEGKIWI 



GATAAGATCTTGACAACTGAAATTGGACTTCACGATTTAAGGAAGAAAATGTCAATCATA 
3301 + + + + + + 3360

CTATTCTAGAACTGTTGACTTTAACCTGAAGTGCTAAATTCCTTCTTTTACAGTTAGTAT

a DKILTTEIGLHDLRKKMSII -



CCTCAGGAACCTGTTTTGTTCACTGGAACAATGAGGAAAAACCTGGATCCCTTTAAGGAG
3361 + + + + + + 3420
GGAGTCCTTGGACAAAACAAGTGACCTTGTTACTCCTTTTTGGACCTAGGGAAATTCCTC
PQEPVLFTGTMRKNLDPFKE - 

CACACGGATGAGGAACTGTGGAATGCCTTACAAGAGGTACAACTTAAAGAAACCATTGAA 
3421 + + + + + + 3480

GTGTGCCTACTCCTTGACACCTTACGGAATGTTCTCCATGTTGAATTTCTTTGGTAACTT
HTDEELWNALQEVQLKETIE -

GATCTTCCTGGTAAAATGGATACTGAATTAGCAGAATCAGGATCCAATTTTAGTGTTGGA

3481 + + + + + + 3540

CTAGAAGGACCATTTTACCTATGACTTAATCGTCTTAGTCCTAGGTTAAAATCACAACCT

DLPGKMDTELAESGSNFSVG - 

CAAAGACAACTGGTGTGCCTTGCCAGGGCAATTCTCAGGAAAAATCAGATATTGATTATT 
3541 + + + + + + 3600 

GTTTCTGTTGACCACACGGAACGGTCCCGTTAAGAGTCCTTTTTAGTCTATAACTAATAA 
QRQLVCLARAILRKNQILII -

GATGAAGCGACGGCAAATGTGGATCCAAGAACTGATGAGTTAATACAAAAAAAAATCCGG 
3601 + + + + + + 3660 

CTACTTCGCTGCCGTTTACACCTAGGTTCTTGACTACTCAATTATGTTTTTTTTTAGGCC
DEATANVDPRTDELIQKKIR -

GAGAAATTTGCCCACTGCACCGTGCTAACCATTGCACACAGATTGAACACCATTATTGAC

3661 + + + + + + 3720

CTCTTTAAACGGGTGACGTGGCACGATTGGTAACGTGTGTCTAACTTGTGGTAATAACTG

EKFAHCTVLTIAHRLNTIID - 

AGCGACAAGATAATGGTTTTAGATTCAGGAAGACTGAAAGAATATGATGAGCCGTATGTT 
3721 + + + + + + 3780 

TCGCTGTTCTATTACCAAAATCTAAGTCCTTCTGACTTTCTTATACTACTCGGCATACAA 
SDKIMVLDSGRLKEYDEPYV -

TTGCTGCAAAATAAAGAGAGCCTATTTTACAAGATGGTGCAACAACTGGGCAAGGCAGAA 
3781 + + + + + + 3840

AACGACGTTTTATTTCTCTCGGATAAAATGTTCTACCACGTTGTTGACCCGTTCCGTCTT
a LLQNKESLFYKMVQQLGKAE -

GCCGCTGCCCTCACTGAAACAGCAAAACAGGTATACTTCAAAAGAAATTATCCACATATT
3841 + + + + + + 3900

CGGCGACGGGAGTGACTTTGTCGTTTTGTCCATATGAAGTTTTCTTTAATAGGTGTATAA
a AAALTETAKQVYFKRNYPHI - 

GGTCACACTGACCACATGGTTACAAACACTTCCAATGGACAGCCCTCGACCTTAACTATT 
3901 + + + + + + 3960

CCAGTGTGACTGGTGTACCAATGTTTGTGAAGGTTACCTGTCGGGAGCTGGAATTGATAA

a GHTDHMVTNTSNGQPSTLTI -

TTCGAGACAGCACTG
3961 + 3975

AAGCTCTGTCGTGAC
a FETAL- 
MOAT C cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY

ATGAAGGATATCGACATAGGAAAAGAGTATATCATCCCCAGTCCTGGGTATAGAAGTGTG
1 + + + + + + 60

TACTTCCTATAGCTGTATCCTTTTCTCATATAGTAGGGGTCAGGACCCATATCTTCACAC
a MKDIDIGKEYIIPSPGYRSV - 

AGGGAGAGAACCAGCACTTCTGGGACGCACAGAGACCGTGAAGATTCCAAGTTCAGGAGA 
61 + + + + + + 120

TCCCTCTCTTGGTCGTGAAGACCCTGCGTGTCTCTGGCACTTCTAAGGTTCAAGTCCTCT
a RERTSTSGTHRDREDSKFRR - 

ACTCGACCGTTGGAATGCCAAGATGCCTTGGAAACAGCAGCCCGAGCGGAGGGCCTCTCT 
121 + + + + + + 180

TGAGCTGGCAACCTTACGGTTCTACGGAACCTTTGTCGTCGGGCTCGGCTCCCGGAGAGA 
a TRPLECQDALETAARAEGLS - 

CTTGATGCCTCCATGCATTCTCAGCTCAGAATCCTGGATGAGGAGCATCCCAAGGGAAAG
181 + + + + + + 240

GAACTACGGAGGTACGTAAGAGTCGAGTCTTAGGACCTACTCCTCGTAGGGTTCCCTTTC 
a LDASMHSQLRILDEEHPKGK - 

TACCATCATGGCTTGAGTGCTCTGAAGCCCATCCGGACTACTTCCAAACACCAGCACCCA
241 + + + + + + 300

ATGGTAGTACCGAACTCACGAGACTTCGGGTAGGCCTGATGAAGGTTTGTGGTCGTGGGT
a YHHGLSALKPIRTTSKHQHP - 

GTGGACAATGCTGGGCTTTTTTCCTGTATGACTTTTTCGTGGCTTTCTTCTCTGGCCCGT
301 + + + + + + 360 

CACCTGTTACGACCCGAAAAAAGGACATACTGAAAAAGCACCGAAAGAAGAGACCGGGCA 
a VDNAGLFSCMTFSWLSSLAR- 

GTGGCCCACAAGAAGGGGGAGCTCTCAATGGAAGACGTGTGGTCTCTGTCCAAGCACGAG 
361 + + + + + + 420

CACCGGGTGTTCTTCCCCCTCGAGAGTTACCTTCTGCACACCAGAGACAGGTTCGTGCTC
a VAHKKGELSMEDVWSLSKHE - 

TCTTCTGACGTGAACTGCAGAAGACTAGAGAGACTGTGGCAAGAAGAGCTGAATGAAGTT
421 + + + + + + 480

AGAAGACTGCACTTGACGTCTTCTGATCTCTCTGACACCGTTCTTCTCGACTTACTTCAA
a SSDVNCRRLERLWQEELNEV - 



GGGCCAGACGCTGCTTCCCTGCGAAGGGTTGTGTGGATCTTCTGCCGCACCAGGCTCATC 
481 + + + + + + 540

CCCGGTCTGCGACGAAGGGACGCTTCCCAACACACCTAGAAGACGGCGTGGTCCGAGTAG 



a GPDAASLRRVVWIFCRTRLI -



CTGTCCATCGTGTGCCTGATGATCACGCAGCTGGCTGGCTTCAGTGGACCAGCCTTCATG 
541 + + + + + + 600 

GACAGGTAGCACACGGACTACTAGTGCGTCGACCGACCGAAGTCACCTGGTCGGAAGTAC 
a LSIVCLMITQLAGFSGPAFM - 

GTGAAACACCTCTTGGAGTATACCCAGGCAACAGAGTCTAACCTGCAGTACAGCTTGTTG 
601 + + + + + + 660 

CACTTTGTGGAGAACCTCATATGGGTCCGTTGTCTCAGATTGGACGTCATGTCGAACAAC 
a VKHLLEYTQATESNLQYSLL - 

TTAGTGCTGGGCCTCCTCCTGACGGAAATCGTGCGGTCTTGGTCGCTTGCACTGACTTGG
661 + + + + + + 720 

AATCACGACCCGGAGGAGGACTGCCTTTAGCACGCCAGAACCAGCGAACGTGACTGAACC 
a LVLGLLLTEIVRSWSLALTW- 



GCATTGAATTACCGAACCGGTGTCCGCTTGCGGGGGGCCATCCTAACCATGGCATTTAAG 
721 + + + + + + 780

CGTAACTTAATGGCTTGGCCACAGGCGAACGCCCCCCGGTAGGATTGGTACCGTAAATTC 



a ALNYRTGVRLRGAILTMAFK 



AAGATCCTTAAGTTAAAGAACATTAAAGAGAAATCCCTGGGTGAGCTCATCAACATTTGC
781 + + + + + + 840

TTCTAGGAATTCAATTTCTTGTAATTTCTCTTTAGGGACCCACTCGAGTAGTTGTAAACG
a KILKLKNIKEKSLGELINIC -

TCCAACGATGGGCAGAGAATGTTTGAGGCAGCAGCCGTTGGCAGCCTGCTGGCTGGAGGA 
841 + + + + + + 900

AGGTTGCTACCCGTCTCTTACAAACTCCGTCGTCGGCAACCGTCGGACGACCGACCTCCT 
a SNDGQRMFEAAAVGSLLAGG - 

CCCGTTGTTGCCATCTTAGGCATGATTTATAATGTAATTATTCTGGGACCAACAGGCTTC 
901 + + + + + + 960

GGGCAACAACGGTAGAATCCGTACTAAATATTACATTAATAAGACCCTGGTTGTCCGAAG 
a PVVAILGMIYNVIILGPTGF - 



CTGGGATCAGCTGTTTTTATCCTCTTTTACCCAGCAATGATGTTTGCATCACGGCTCACA
961 + + + + + + 1020



GACCCTAGTCGACAAAAATAGGAGAAAATGGGTCGTTACTACAAACGTAGTGCCGAGTGT 
a LGSAVFILFYPAMMFASRLT - 

GCATATTTCAGGAGAAAATGCGTGGCCGCCACGGATGAACGTGTCCAGAAGATGAATGAA 
1021 + + + + + + 1080 

CGTATAAAGTCCTCTTTTACGCACCGGCGGTGCCTACTTGCACAGGTCTTCTACTTACTT 
a AYFRRKCVAATDERVQKMNE - 

GTTCTTACTTACATTAAATTTATCAAAATGTATGCCTGGGTCAAAGCATTTTCTCAGAGT 
1081 + + + + + + 1140 

CAAGAATGAATGTAATTTAAATAGTTTTACATACGGACCCAGTTTCGTAAAAGAGTCTCA 
a VLTYIKFIKMYAWVKAFSQS - 

GTTCAGAAAATCCGCGAGGAGGAGCGTCGGATATTGGAAAAAGCCGGGTACTTCCAGGGT 
1141 + + + + + + 1200

CAAGTCTTTTAGGCGCTCCTCCTCGCAGCCTATAACCTTTTTCGGCCCATGAAGGTCCCA 
a VQKIREEERRILEKAGYFQG - 

ATCACTGTGGGTGTGGCTCCCATTGTGGTGGTGATTGCCAGCGTGGTGACCTTCTCTGTT 
1201 + + + + + + 1260

TAGTGACACCCACACCGAGGGTAACACCACCACTAACGGTCGCACCACTGGAAGAGACAA 
a ITVGVAPIVVVIASVVTFSV -

CATATGACCCTGGGCTTCGATCTGACAGCAGCACAGGCTTTCACAGTGGTGACAGTCTTC
1261 + + + + + + 1320

GTATACTGGGACCCGAAGCTAGACTGTCGTCGTGTCCGAAAGTGTCACCACTGTCAGAAG 
a HMTLGFDLTAAQAFTVVTVF -

AATTCCATGACTTTTGCTTTGAAAGTAACACCGTTTTCAGTAAAGTCCCTCTCAGAAGCC
1321 + + + + + + 1380

TTAAGGTACTGAAAACGAAACTTTCATTGTGGCAAAAGTCATTTCAGGGAGAGTCTTCGG 
a NSMTFALKVTPFSVKSLSEA - 

TCAGTGGCTGTTGACAGATTTAAGAGTTTGTTTCTAATGGAAGAGGTTCACATGATAAAG 
1381 + + + + + + 1440

AGTCACCGACAACTGTCTAAATTCTCAAACAAAGATTACCTTCTCCAAGTGTACTATTTC 



a SVAVDRFKSLFLMEEVHMIK -

AACAAACCAGCCAGTCCTCACATCAAGATAGAGATGAAAAATGCCACCTTGGCATGGGAC
1441 + + + + + + 1500

TTGTTTGGTCGGTCAGGAGTGTAGTTCTATCTCTACTTTTTACGGTGGAACCGTACCCTG
a NKPASPHIKIEMKNATLAWD- 

TCCTCCCACTCCAGTATCCAGAACTCGCCCAAGCTGACCCCCAAAATGAAAAAAGACAAG 
1501 + + + + + + 1560

AGGAGGGTGAGGTCATAGGTCTTGAGCGGGTTCGACTGGGGGTTTTACTTTTTTCTGTTC

a SSHSSIQNSPKLTPKMKKDK - 

AGGGCTTCCAGGGGCAAGAAAGAGAAGGTGAGGCAGCTGCAGCGCACTGAGCATCAGGCG 
1561 + + + + + + 1620

TCCCGAAGGTCCCCGTTCTTTCTCTTCCACTCCGTCGACGTCGCGTGACTCGTAGTCCGC
a RASRGKKEKVRQLQRTEHQA - 

GTGCTGGCAGAGCAGAAAGGCCACCTCCTCCTGGACAGTGACGAGCGGCCCAGTCCCGAA 
1621 + + + + + + 1680

CACGACCGTCTCGTCTTTCCGGTGGAGGAGGACCTGTCACTGCTCGCCGGGTCAGGGCTT 
a VLAEQKGHLLLDSDERPSPE -



GAGGAAGAAGGCAAGCACATCCACCTGGGCCACCTGCGCTTACAGAGGACACTGCACAGC 
1681 + + + + + + 1740

CTCCTTCTTCCGTTCGTGTAGGTGGACCCGGTGGACGCGAATGTCTCCTGTGACGTGTCG 



a EEEGKHIHLGHLRLQRTLHS -



ATCGATCTGGAGATCCAAGAGGGTAAACTGGTTGGAATCTGCGGCAGTGTGGGAAGTGGA 
1741 + + + + + + 1800 

TAGCTAGACCTCTAGGTTCTCCCATTTGACCAACCTTAGACGCCGTCACACCCTTCACCT 



a IDLEIQEGKLVGICGSVGSG -



AAAACCTCTCTCATTTCAGCCATTTTAGGCCAGATGACGCTTCTAGAGGGCAGCATTGCA
1801 + + + + + + 1860 

TTTTGGAGAGAGTAAAGTCGGTAAAATCCGGTCTACTGCGAAGATCTCCCGTCGTAACGT 
a KTSLISAILGQMTLLEGSIA - 

ATCAGTGGAACCTTCGCTTATGTGGCCCAGCAGGCCTGGATCCTCAATGCTACTCTGAGA 
1861 + + + + + + 1920

TAGTCACCTTGGAAGCGAATACACCGGGTCGTCCGGACCTAGGAGTTACGATGAGACTCT
a ISGTFAYVAQQAWILNATLR -

GACAACATCCTGTTTGGGAAGGAATATGATGAAGAAAGATACAACTCTGTGCTGAACAGC
1921 + + + + + + 1980 

CTGTTGTAGGACAAACCCTTCCTTATACTACTTCTTTCTATGTTGAGACACGACTTGTCG
a DNILFGKEYDEERYNSVLNS - 

TGCTGCCTGAGGCCTGACCTGGCCATTCTTCCCAGCAGCGACCTGACGGAGATTGGAGAG 
1981 + + + + + + 2040

ACGACGGACTCCGGACTGGACCGGTAAGAAGGGTCGTCGCTGGACTGCCTCTAACCTCTC
a CCLRPDLAILPSSDLTEIGE -

CGAGGAGCCAACCTGAGCGGTGGGCAGCGCCAGAGGATCAGCCTTGCCCGGGCCTTGTAT 
2041 + + + + + + 2100

GCTCCTCGGTTGGACTCGCCACCCGTCGCGGTCTCCTAGTCGGAACGGGCCCGGAACATA 
a RGANLSGGQRQRISLARALY - 
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GGTACACCCG 

> sd "S'v, looplsaldahvg 

+ H + + 2220 

TTC GTC T A 0 AAGTTAT c A C GATA o G CCTTTGrACAOTTCAGaTTCT3TCAAOAC« A C AA 
• NHIF "SAinKHLKSKTVI.FV. 

222 T— C ^^ ACCTGG " GACTGTGATO « GT <=ATCTTC A rO AAAGAGaGCTC T 

* + + ~*~ + 2280 

t GG gtg G tc A at G tc A t G g a cc AA ct GA cactacttcactagaa G tactttctccc G aca 

■ TH 0<-QVLVDCDEVI FMKEGC . 

J^^O *«CACCC A TG A GG AA CT G AT G AATrrA AA T GGT GACTAT G CT A CCATT 

~ + + + + 2340 

T AA T G CCTTTCTCCG TGGGTAC TCCTT GA CT A Crr A « mA CC A CT GA T A C GA TG G r AA 

« itergtheelmnlngovati 

234^— ^t+^^^ g 4 ^ g _^ gggagaga ^ a ^ cgc ^^^ ga ^* t ^ aatt< - aaaaaaggaaagg 

-+ 2400 



AAATTATTGGACAACGACCCTCTCTGTGGCGGTCAACTC 



« PNNLLLGETPPVEl NSKK 

2401 + + ACAAGACAAGGGTC CTAAAACAGGATCAGTAAAGAAGGAA 

+ - + + 2460 



rCTAGTTAAGTTTTTTCCTTTGG 
E T - 

AGTGGTTCACAGAAGAAGTC, 
01 + + + 

TCACCAAGTGTCTTCTTCAGTGTTCTGTTCCCAGGATTTTGTCCTAGTCA^^ 

a sgsqkksqdkgpktgsvkke - 

^ CGTCA ^CGGTCT^^ 

a kavkpeegqlvqleekgqgs - 
GTGCCCTGGTCAGTATATGGTGTCTACATCCAGGCTGCTGGGGGCCCCTTGGCATTCCTG
2521 + + + + + + 2580

CACGGGACCAGTCATATACCACAGATGTAGGTCCGACGACCCCCGGGGAACCGTAAGGAC 
a VPWSVYGVYIQAAGGPLAFL - 

GTTATTATGGCCCTTTTCATGCTGAATGTAGGCAGCACCGCCTTCAGCACCTGGTGGTTG 
2581 + + + + + + 2640

CAATAATACCGGGAAAAGTACGACTTACATCCGTCGTGGCGGAAGTCGTGGACCACCAAC

a VIMALFMLNVGSTAFSTWWL - 

AGTTACTGGATCAAGCAAGGAAGCGGGAACACCACTGTGACTCGAGGGAACGAGACCTCG 
2641 + + + + + + 2700

TCAATGACCTAGTTCGTTCCTTCGCCCTTGTGGTGACACTGAGCTCCCTTGCTCTGGAGC
a SYWIKQGSGNTTVTRGNETS- 

GTGAGTGACAGCATGAAGGACAATCCTCATATGCAGTACTATGCCAGCATCTACGCCCTC 
2701 + + + + + + 2760

CACTCACTGTCGTACTTCCTGTTAGGAGTATACGTCATGATACGGTCGTAGATGCGGGAG 
a VSDSMKDNPHMQYYASIYAL- 

TCCATGGCAGTCATGCTGATCCTGAAAGCCATTCGAGGAGTTGTCTTTGTCAAGGGCACG 
2761 + + + + + + 2820 

AGGTACCGTCAGTACGACTAGGACTTTCGGTAAGCTCCTCAACAGAAACAGTTCCCGTGC 
a SMAVMLILKAIRGVVFVKGT - 

CTGCGAGCTTCCTCCCGGCTGCATGACGAGCTTTTCCGAAGGATCCTTCGAAGCCCTATG 
2821 + + + + + + 2880 

GACGCTCGAAGGAGGGCCGACGTACTGCTCGAAAAGGCTTCCTAGGAAGCTTCGGGATAC 

a LRASSRLHDELFRRILRSPM - 

AAGTTTTTTGACACGACCCCCACAGGGAGGATTCTCAACAGGTTTTCCAAAGACATGGAT 
2881 + + + + + + 2940 

TTCAAAAAACTGTGCTGGGGGTGTCCCTCCTAAGAGTTGTCCAAAAGGTTTCTGTACCTA 

a KFFDTTPTGRILNRFSKDMD -

GAAGTTGACGTGCGGCTGCCGTTCCAGGCCGAGATGTTCATCCAGAACGTTATCCTGGTG
2941 + + + + + + 3000

CTTCAACTGCACGCCGACGGCAAGGTCCGGCTCTACAAGTAGGTCTTGCAATAGGACCAC 



a EVDVRLPFQAEMFIQNVILV -



TTCTTCTGTGTGGGAATGATCGCAGGAGTCTTCCCGTGGTTCCTTGTGGCAGTGGGGCCC 
3001 + + + + + + 3060

AAGAAGACACACCCTTACTAGCGTCCTCAGAAGGGCACCAAGGAACACCGTCACCCCGGG

a FFCVGMIAGVFPWFLVAVGP -



CTTGTCATCCTCTTTTCAGTCCTGCACATTGTCTCCAGGGTCCTGATTCGGGAGCTGAAG 
3061 + + + + + + 3120 

GAACAGTAGGAGAAAAGTCAGGACGTGTAACAGAGGTCCCAGGACTAAGCCCTCGACTTC 
a LVILFSVLHIVSRVLIRELK- 

CGTCTGGACAATATCACGCAGTCACCTTTCCTCTCCCACATCACGTCCAGCATACAGGGC
3121 + + + + + + 3180 

GCAGACCTGTTATAGTGCGTCAGTGGAAAGGAGAGGGTGTAGTGCAGGTCGTATGTCCCG 
a RLDNITQSPFLSHITSSIQG - 



CTTGCCACCATCCACGCCTACAATAAAGGGCAGGAGTTTCTGCACAGATACCAGGAGCTG
3181 + + + + + + 3240

GAACGGTGGTAGGTGCGGATGTTATTTCCCGTCCTCAAAGACGTGTCTATGGTCCTCGAC 



a LATIHAYNKGQEFLHRYQEL - 

CTGGATGACAACCAAGCTCCTTTTTTTTTGTTTACGTGTGCGATGCGGTGGCTGGCTGTG
3241 + + + + + + 3300 

GACCTACTGTTGGTTCGAGGAAAAAAAAACAAATGCACACGCTACGCCACCGACCGACAC 



a LDDNQAPFFLFTCAMRWLAV -



CGGCTGGACCTCATCAGCATCGCCCTCATCACCACCACGGGGCTGATGATCGTTCTTATG 
3301 + + + + + + 3360 

GCCGACCTGGAGTAGTCGTAGCGGGAGTAGTGGTGGTGCCCCGACTACTAGCAAGAATAC 
a RLDLISIALITTTGLMIVLM - 

CACGGGCAGATTCCCCCAGCCTATGCGGGTCTCGCCATCTCTTATGCTGTCCAGTTAACG 
3361 + + + + + + 3420 
GTGCCCGTCTAAGGGGGTCGGATACGCCCAGAGCGGTAGAGAATACGACAGGTCAATTGC
a HGQIPPAYAGLAISYAVQLT - 

GGGCTGTTCCAGTTTACGGTCAGACTGGCATCTGAGACAGAAGCTCGATTCACCTCGGTG 
3421 + + + + + + 3480

CCCGACAAGGTCAAATGCCAGTCTGACCGTAGACTCTGTCTTCGAGCTAAGTGGAGCCAC 
a GLFQFTVRLASETEARFTSV - 

GAGAGGATCAATCACTACATTAAGACTCTGTCCTTGGAAGCACCTGCCAGAATTAAGAAC 
3481 + + + + + + 3540

CTCTCCTAGTTAGTGATGTAATTCTGAGACAGGAACCTTCGTGGACGGTCTTAATTCTTG 
a ERINHYIKTLSLEAPARIKN - 

AAGGCTCCCTCCCCTGACTGGCCCCAGGAGGGAGAGGTGACCTTTGAGAACGCAGAGATG 
3541 + + + + + + 3600

TTCCGAGGGAGGGGACTGACCGGGGTCCTCCCTCTCCACTGGAAACTCTTGCGTCTCTAC
a KAPSPDWPQEGEVTFENAEM -

AGGTACCGAGAAAACCTCCCTCTTGTCCTAAAGAAAGTATCCTTCACGATCAAACCTAAA 
3601 + + + + + + 3660 

TCCATGGCTCTTTTGGAGGGAGAACAGGATTTCTTTCATAGGAAGTGCTAGTTTGGATTT
a RYRENLPLVLKKVSFTIKPK -

GAGAAGATTGGCATTGTGGGGCGGACAGGATCAGGGAAGTCCTCGCTGGGGATGGCCCTC 
3661 + + + + + + 3720 

CTCTTCTAACCGTAACACCCCGCCTGTCCTAGTCCCTTCAGGAGCGACCCCTACCGGGAG 
a EKIGIVGRTGSGKSSLGMAL - 

TTCCGTCTGGTGGAGTTATCTGGAGGCTGCATCAAGATTGATGGAGTGAGAATCAGTGAT 
3721 + + + + + + 3780 

AAGGCAGACCACCTCAATAGACCTCCGACGTAGTTCTAACTACCTCACTCTTAGTCACTA 
a FRLVELSGGCIKIDGVRISD -

ATTGGCCTTGCCGACCTCCGAAGCAAACTCTCTATCATTCCTCAAGAGCCGGTGCTGTTC
3781 + + + + + + 3840

TAACCGGAACGGCTGGAGGCTTCGTTTGAGAGATAGTAAGGAGTTCTCGGCCACGACAAG
a IGLADLRSKLSIIPQEPVLF -

AGTGGCACTGTCAGATCAAATTTGGACCCCTTCAACCAGTACACTGAAGACCAGATTTGG
3841 + + + + + + 3900

TCACCGTGACAGTCTAGTTTAAACCTGGGGAAGTTGGTCATGTGACTTCTGGTCTAAACC
a SGTVRSNLDPFNQYTEDQIW -

GATGCCCTGGAGAGGACACACATGAAAGAATGTATTGCTCAGCTACCTCTGAAACTTGAA
3901 + + + + + + 3960

CTACGGGACCTCTCCTGTGTGTACTTTCTTACATAACGAGTCGATGGAGACTTTGAACTT

a DALERTHMKECIAQLPLKLE -



TCTGAAGTGATGGAGAATGGGGATAACTTCTCAGTGGGGGAACGGCAGCTCTTGTGCATA
3961 + + + + + + 4020

AGACTTCACTACCTCTTACCCCTATTGAAGAGTCACCCCCTTGCCGTCGAGAACACGTAT 
a SEVMENGDNFSVGERQLLCI - 

GCTAGAGCCCTGCTCCGCCACTGTAAGATTCTGATTTTAGATGAAGCCACAGCTGCCATG
4021 + + + + + + 4080

CGATCTCGGGACGAGGCGGTGACATTCTAAGACTAAAATCTACTTCGGTGTCGACGGTAC

a ARALLRHCKILILDEATAAM -



GACACAGAGACAGACTTATTGATTCAAGAGACCATCCGAGAAGCATTTGCAGACTGTACC
4081 + + + + + + 4140

CTGTGTCTCTGTCTGAATAACTAAGTTCTCTGGTAGGCTCTTCGTAAACGTCTGACATGG
a DTETDLLIQETIREAFADCT -

ATGCTGACCATTGCCCATCGCCTGCACACGGTTCTAGGCTCCGATAGGATTATGGTGCTG 

4141 + + + + + + 4200

TACGACTGGTAACGGGTAGCGGACGTGTGCCAAGATCCGAGGCTATCCTAATACCACGAC
a MLTIAHRLHTVLGSDRIMVL -

GCCCAGGGACAGGTGGTGGAGTTTGACACCCCATCGGTCCTTCTGTCCAACGACAGTTCC
4201 + + + + + + 4260

CGGGTCCCTGTCCACCACCTCAAACTGTGGGGTAGCCAGGAAGACAGGTTGCTGTCAAGG
a AQGQVVEFDTPSVLLSNDSS -

CGATTCTATGCCATGTTTGCTGCTGCAGAGAACAAGGTCGCTGTCAAGGGCTGA
4261 + + + + + 4314

GCTAAGATACGGTACAAACGACGACGTCTCTTGTTCCAGCGACAGTTCCCGACT
a RFYAMFAAAENKVAVKG* -
MOAT D cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY

ATGGACGCCCTGTGCGGTTCCGGGGAGCTCGGCTCCAAGTTCTGGGACTCCAACCTGTCT
1 + + + + + + 60

TACCTGCGGGACACGCCAAGGCCCCTCGAGCCGAGGTTCAAGACCCTGAGGTTGGACAGA
a MDALCGSGELGSKFWDSNLS -

GTGCACACAGAAAACCCGGACCTCACTCCCTGCTTCCAGAACTCCCTGCTGGCCTGGGTG 
61 + + + + + + 120

CACGTGTGTCTTTTGGGCCTGGAGTGAGGGACGAAGGTCTTGAGGGACGACCGGACCCAC

VHTENPDLTPCFQNSLLAWV -

CCCTGCATCTACCTGTGGGTCGCCCTGCCCTGCTACTTGCTCTACCTGCGGCACCATTGT 
121 + + + + + + 180

GGGACGTAGATGGACACCCAGCGGGACGGGACGATGAACGAGATGGACGCCGTGGTAACA 

PCIYLWVALPCYLLYLRHHC - 

CGTGGCTACATCATCCTCTCCCACCTGTCCAAGCTCAAGATGGTCCTGGGTGTCCTGCTG 
181 + + + + + + 240 

GCACCGATGTAGTAGGAGAGGGTGGACAGGTTCGAGTTCTACCAGGACCCACAGGACGAC 
RGYIILSHLSKLKMVLGVLL - 



TGGTGCGTCTCCTGGGCGGACCTTTTTTACTCCTTCCATGGCCTGGTCCATGGCCGGGCC
241 + + + + + + 300

ACCACGCAGAGGACCCGCCTGGAAAAAATGAGGAAGGTACCGGACCAGGTACCGGCCCGG
WCVSWADLFYSFHGLVHGRA - 

CCTGCCCCTGTTTTCTTTGTCACCCCCTTGGTGGTGGGGGTCACCATGCTGCTGGCCACC 
301 + + + + + + 360

GGACGGGGACAAAAGAAACAGTGGGGGAACCACCACCCCCAGTGGTACGACGACCGGTGG 
PAPVFFVTPLVVGVTMLLAT - 

CTGCTGATACAGTATGAGCGGCTGCAGGGCGTACAGTCTTCGGGGGTCCTCATTATCTTC 
361 + + + + + + 420

GACGACTATGTCATACTCGCCGACGTCCCGCATGTCAGAAGCCCCCAGGAGTAATAGAAG

a LLIQYERLQGVQSSGVLIIF -

TGGTTCCTGTGTGTGGTCTGCGCCATCGTCCCATTCCGCTCCAAGATCCTTTTAGCCAAG 
421 + + + + + + 480 

ACCAAGGACACACACCAGACGCGGTAGCAGGGTAAGGCGAGGTTCTAGGAAAATCGGTTC
a WFLCVVCAIVPFRSKILLAK - 

GCAGAGGGTGAGATCTCAGACCCCTTCCGCTTCACCACCTTCTACATCCACTTTGCCCTG
481 + + + + + + 540

CGTCTCCCACTCTAGAGTCTGGGGAAGGCGAAGTGGTGGAAGATGTAGGTGAAACGGGAC 
a AEGEISDPFRFTTFYIHFAL - 



GTACTCTCTGCCCTCATCTTGGCCTGCTTCAGGGAGAAACCTCCATTTTTCTCCGCAAAG
541 + + + + + + 600

CATGAGAGACGGGAGTAGAACCGGACGAAGTCCCTCTTTGGAGGTAAAAAGAGGCGTTTC
a VLSALILACFREKPPFFSAK -

AATGTCGACCCTAACCCCTACCCTGAGACCAGCGCTGGCTTTCTCTCCCGCCTGTTTTTC
601 + + + + + + 660

TTACAGCTGGGATTGGGGATGGGACTCTGGTCGCGACCGAAAGAGAGGGCGGACAAAAAG
NVDPNPYPETSAGFLSRLFF -

TGGTGGTTCACAAAGATGGCCATCTATGGCTACCGGCATCCCCTGGAGGAGAAGGACCTC
661 + + + + + + 720

ACCACCAAGTGTTTCTACCGGTAGATACCGATGGCCGTAGGGGACCTCCTCTTCCTGGAG
a WWFTKMAIYGYRHPLEEKDL - 

TGGTCCCTAAAGGAAGAGGACAGATCCCAGATGGTGGTGCAGCAGCTGCTGGAGGCATGG 
721 + + + + + + 780

ACCAGGGATTTCCTTCTCCTGTCTAGGGTCTACCACCACGTCGTCGACGACCTCCGTACC
a WSLKEEDRSQMVVQQLLEAW -

AGGAAGCAGGAAAAGCAGACGGCACGACACAAGGCTTCAGCAGCACCTGGGAAAAATGCC 
781 + + + + + + 840
TCCTTCGTCCTTTTCGTCTGCCGTGCTGTGTTCCGAAGTCGTCGTGGACCCTTTTTACGG
a RKQEKQTARHKASAAPGKNA - 

TCCGGCGAGGACGAGGTGCTGCTGGGTGCCCGGCCCAGGCCCCGGAAGCCCTCCTTCCTG 

841 + + + + + + 900

AGGCCGCTCCTGCTCCACGACGACCCACGGGCCGGGTCCGGGGCCTTCGGGAGGAAGGAC
a SGEDEVLLGARPRPRKPSFL - 

AAGGCCCTGCTGGCCACCTTCGGCTCCAGCTTCCTCATCAGTGCCTGCTTCAAGCTTATC 
901 + + + + + + 960

TTCCGGGACGACCGGTGGAAGCCGAGGTCGAAGGAGTAGTCACGGACGAAGTTCGAATAG 
a KALLATFGSSFLISACFKLI - 

CAGGACCTGCTCTCCTTCATCAATCCACAGCTGCTCAGCATCCTGATCAGGTTTATCTCC 
961 + + + + + + 1020 

GTCCTGGACGAGAGGAAGTAGTTAGGTGTCGACGAGTCGTAGGACTAGTCCAAATAGAGG 
a QDLLSFINPQLLSILIRFIS - 

AACCCCATGGCCCCCTCCTGGTGGGGCTTCCTGGTGGCTGGGCTGATGTTCCTGTGCTCC 
1021 + + + + + + 1080 

TTGGGGTACCGGGGGAGGACCACCCCGAAGGACCACCGACCCGACTACAAGGACACGAGG 
a NPMAPSWWGFLVAGLMFLCS -

ATGATGCAGTCGCTGATCTTACAACACTATTACCACTACATCTTTGTGACTGGGGTGAAG

1081 + + + + + + 1140

TACTACGTCAGCGACTAGAATGTTGTGATAATGGTGATGTAGAAACACTGACCCCACTTC

a MMQSLILQHYYHYIFVTGVK - 

TTTCGTACTGGGATCATGGGTGTCATCTACAGGAAGGCTCTGGTTATCACCAACTCAGTC 
1141 + + + + + + 1200

AAAGCATGACCCTAGTACCCACAGTAGATGTCCTTCCGAGACCAATAGTGGTTGAGTCAG
a FRTGIMGVIYRKALVITNSV -

AAACGTGCGTCCACTGTGGGGGAAATTGTCAACCTCATGTCAGTGGATGCCCAGCGCTTC 
1201 + + + _ + + + 1260 

TTTGCACGCAGGTGACACCCCCTTTAACAGTTGGAGTACAGTCACCTACGGGTCGCGAAG 
a KRASTVGEIVNLMSVDAQRF -

ATGGACCTTGCCCCCTTCCTCAATCTGCTGTGGTCAGCACCCCTGCAGATCATCCTGGCG
1261 + + + + + + 1320

TACCTGGAACGGGGGAAGGAGTTAGACGACACCAGTCGTGGGGACGTCTAGTAGGACCGC 
a MDLAPFLNLLWSAPLQIILA 



ATCTACTTCCTCTGGCAGAACCTAGGTCCCTCTGTCCTGGCTGGAGTCGCTTTCATGGTC 
1321 + + + + + + 1380

TAGATGAAGGAGACCGTCTTGGATCCAGGGAGACAGGACCGACCTCAGCGAAAGTACCAG 



a IYFLWQNLGPSVLAGVAFMV - 

TTGCTGATTCCACTCAACGGAGCTGTGGCCGTGAAGATGCGCGCCTTCCAGGTAAAGCAA 
1381 + + + + + + 144Q 

AACGACTAAGGTGAGTTGCCTCGACACCGGCACTTCTACGCGCGGAAGGTCCATTTCGTT 



a LLIPLNGAVAVKMRAFQVKQ -



ATGAAATTGAAGGACTCGCGCATCAAGCTGATGAGTGAGATCCTGAACGGCATCAAGGTG 
1441 + + + + + + 1500 

TACTTTAACTTCCTGAGCGCGTAGTTCGACTACTCACTCTAGGACTTGCCGTAGTTCCAC 
a MKLKDSRIKLMSEILNGIKV - 

CTGAAGCTGTACGCCTGGGAGCCCAGCTTCCTGAAGCAGGTGGAGGGCATCCGGCAGGGT 
1501 + + + + + + 1560 

GACTTCGACATGCGGACCCTCGGGTCGAAGGACTTCGTCCACCTCCCGTAGGCCGTCCCA
a LKLYAWEPSFLKQVEGIRQG - 

GAGCTCCAGCTGCTGCGCACGGCGGCCTACCTCCACACCACAACCACCTTCACCTGGATG 
1561 + + + + + + 1620

CTCGAGGTCGACGACGCGTGCCGCCGGATGGAGGTGTGGTGTTGGTGGAAGTGGACCTAC
a ELQLLRTAAYLHTTTTFTWM -

TGCAGCCCCTTCCTGGTGACCCTGATCACCCTCTGGGTGTACGTGTACGTGGACCCAAAC 
1621 + + + + + + 1680

ACGTCGGGGAAGGACCACTGGGACTAGTGGGAGACCCACATGCACATGCACCTGGGTTTG
CSPFLVTLITLWVYVYVDPN -

AATGTGCTGGACGCCGAGAAGGCCTTTGTGTCTGTGTCCTTGTTTAATATCTTAAGACTT 
1681 + + — + + _ + + 1740 

TTACACGACCTGCGGCTCTTCCGGAAACACAGACACAGGAACAAATTATAGAATTCTGAA 
NVLDAEKAFVSVSLFNILRL - 

CCCCTCAACATGCTGCCCCAGTTAATCAGCAACCTGACTCAGGCCAGTGTGTCTCTGAAA 
1741 + + + + + + 1800

GGGGAGTTGTACGACGGGGTCAATTAGTCGTTGGACTGAGTCCGGTCACACAGAGACTTT
PLNMLPQLISNLTQASVSLK -

CGGATCCAGCAATTCCTGAGCCAAGAGGAACTTGACCCCCAGAGTGTGGAAAGAAAGACC 
1801 + + + + + + 1860

GCCTAGGTCGTTAAGGACTCGGTTCTCCTTGAACTGGGGGTCTCACACCTTTCTTTCTGG
RIQQFLSQEELDPQSVERKT - 

ATCTCCCCAGGCTATGCCATCACCATACACAGTGGCACCTTCACCTGGGCCCAGGACCTG 
1861 + + + + + + 1920 

TAGAGGGGTCCGATACGGTAGTGGTATGTGTCACCGTGGAAGTGGACCCGGGTCCTGGAC 
ISPGYAITIHSGTFTWAQDL -

CCCCCCACTCTGCACAGCCTAGACATCCAGGTCCCGAAAGGGGCACTGGTGGCCGTGGTG 
1921 + + + + + + 1980 

GGGGGGTGAGACGTGTCGGATCTGTAGGTCCAGGGCTTTCCCCGTGACCACCGGCACCAC 
PPTLHSLDIQVPKGALVAVV -

GGGCCTGTGGGCTGTGGGAAGTCCTCCCTGGTGTCTGCCCTGCTGGGAGAGATGGAGAAG 
1981 + + + + + + 2040 

CCCGGACACCCGACACCCTTCAGGAGGGACCACAGACGGGACGACCCTCTCTACCTCTTC 
GPVGCGKSSLVSALLGEMEK - 

CTAGAAGGCAAAGTGCACATGAAGGCATGGATCCAGAACTGCACTCTTCAGGAAAACGTG 
2041 + + + + + + 2100 

GATCTTCCGTTTCACGTGTACTTCCGTACCTAGGTCTTGACGTGAGAAGTCCTTTTGCAC 
LEGKVHMKAWIQNCTLQENV 
CTTTTCGGCAAAGCCCTGAACCCCAAGCGCTACCAGCAGACTCTGGAGGCCTGTGCCTTG
2101 + + + + + + 2160

GAAAAGCCGTTTCGGGACTTGGGGTTCGCGATGGTCGTCTGAGACCTCCGGACACGGAAC



a LFGKALNPKRYQQTLEACAL - 

CTAGCTGACCTGGAGATGCTGCCTGGTGGGGATCAGACAGAGATTGGAGAGAAGGGCATT 
2161 + + + + + + 2220

GATCGACTGGACCTCTACGACGGACCACCCCTAGTCTGTCTCTAACCTCTCTTCCCGTAA
a LADLEMLPGGDQTEIGEKGI -

AACCTGTCTGGGGGCCAGCGGCAGCGGGTCAGTCTGGCTCGAGCTGTTTACAGTGATGCC 
2221 + + + + + + 2280 

TTGGACAGACCCCCGGTCGCCGTCGCCCAGTCAGACCGAGCTCGACAAATGTCACTACGG 
a NLSGGQRQRVSLARAVYSDA - 



GATATTTTCTTGCTGGATGACCCACTGTCCGCGGTGGACTCTCATGTGGCCAAGCACATC
2281 + + + + + + 2340

CTATAAAAGAACGACCTACTGGGTGACAGGCGCCACCTGAGAGTACACCGGTTCGTGTAG 
a DIFLLDDPLSAVDSHVAKHI - 

TTTGACCACGTCATCGGGCCAGAAGGCGTGCTGGCAGGCAAGACGCGAGTGCTGGTGACG 
2341 + + + + + + 2400 

AAACTGGTGCAGTAGCCCGGTCTTCCGCACGACCGTCCGTTCTGCGCTCACGACCACTGC 
a FDHVIGPEGVLAGKTRVLVT - 

CACGGCATTAGCTTCCTGCCCCAGACAGACTTCATCATTGTGCTAGCTGATGGACAGGTG 
2401 + + + + + + 2460 

GTGCCGTAATCGAAGGACGGGGTCTGTCTGAAGTAGTAACACGATCGACTACCTGTCCAC 
a HGISFLPQTDFIIVLADGQV -



TCTGAGATGGGCCCGTACCCAGCCCTGCTGCAGCGCAACGGCTCCTTTGCCAACTTTCTC 

2461 + + + + + + 2520



AGACTCTACCCGGGCATGGGTCGGGACGACGTCGCGTTGCCGAGGAAACGGTTGAAAGAG 
a SEMGPYPALLQRNGSFANFL - 
TGCAACTATGCCCCCGATGAGGACCAAGGGCACCTGGAGGACAGCTGGACCGCGTTGGAA

2521 + + + + + + 2580

ACGTTGATACGGGGGCTACTCCTGGTTCCCGTGGACCTCCTGTCGACCTGGCGCAACCTT 

a CNYAPDEDQGHLEDSWTALE - 

GGTGCAGAGGATAAGGAGGCACTGCTGATTGAAGACACACTCAGCAACCACACGGATCTG
2581 + + + + + + 2640

CCACGTCTCCTATTCCTCCGTGACGACTAACTTCTGTGTGAGTCGTTGGTGTGCCTAGAC 
a GAEDKEALLIEDTLSNHTDL - 

ACAGACAATGATCCAGTCACCTATGTGGTCCAGAAGCAGTTTATGAGACAGCTGAGTGCC 
2641 + + + + + + 2700 

TGTCTGTTACTAGGTCAGTGGATACACCAGGTCTTCGTCAAATACTCTGTCGACTCACGG
a TDNDPVTYVVQKQFMRQLSA - 

CTGTCCTCAGATGGGGAGGGACAGGGTCGGCCTGTACCCCGGAGGCACCTGGGTCCATCA 
2701 + + + + + + 2760 

GACAGGAGTCTACCCCTCCCTGTCCCAGCCGGACATGGGGCCTCCGTGGACCCAGGTAGT 
a LSSDGEGQGRPVPRRHLGPS - 

GAGAAGGTGCAGGTGACAGAGGCGAAGGCAGATGGGGCACTGACCCAGGAGGAGAAAGCA 
2761 + + + + + + 2820

CTCTTCCACGTCCACTGTCTCCGCTTCCGTCTACCCCGTGACTGGGTCCTCCTCTTTCGT 

a EKVQVTEAKADGALTQEEKA - 

GCCATTGGCACTGTGGAGCTCAGTGTGTTCTGGGATTATGCCAAGGCCGTGGGGCTCTGT
2821 + + + + + + 2880 

CGGTAACCGTGACACCTCGAGTCACACAAGACCCTAATACGGTTCCGGCACCCCGAGACA 
a AIGTVELSVFWDYAKAVGLC - 

ACCACGCTGGCCATCTGTCTCCTGTATGTGGGTCAAAGTGCGGCTGCCATTGGAGCCAAT 
2881 + + + + + + 2940 

TGGTGCGACCGGTAGACAGAGGACATACACCCAGTTTCACGCCGACGGTAACCTCGGTTA 
a TTLAICLLYVGQSAAAIGAN -

GTGTGGCTCAGTGCCTGGACAAATGATGCCATGGCAGACAGTAGACAGAACAACACTTCC 



Figure 14G 

SUBSTITUTE SHEET (RULE 26) 



WO 99/49735 



PCT/US99/06644 



42/56 



2941 + + + + + + 3000

CACACCGAGTCACGGACCTGTTTACTACGGTACCGTCTGTCATCTGTCTTGTTGTGAAGG
a VWLSAWTNDAMADSRQNNTS -

CTGAGGCTGGGCGTCTATGCTGCTTTAGGAATTCTGCAAGGGTTCTTGGTGATGCTGGCA 
3001 + + + + + + 3060

GACTCCGACCCGCAGATACGACGAAATCCTTAAGACGTTCCCAAGAACCACTACGACCGT 
a LRLGVYAALGILQGFLVMLA - 

GCCATGGCCATGGCAGCGGGTGGCATCCAGGCTGCCCGTGTGTTGCACCAGGCACTGCTG 
3061 + + + + + + 3120 

CGGTACCGGTACCGTCGCCCACCGTAGGTCCGACGGGCACACAACGTGGTCCGTGACGAC 
a AMAMAAGGIQAARVLHQALL -

CACAACAAGATACGCTCGCCACAGTCCTTCTTTGACACCACACCATCAGGCCGCATCCTG

3121 + + + + + + 3180 

GTGTTGTTCTATGCGAGCGGTGTCAGGAAGAAACTGTGGTGTGGTAGTCCGGCGTAGGAC 

a HNKIRSPQSFFDTTPSGRIL- 

AACTGCTTCTCCAAGGACATCTATGTCGTTGATGAGGTTCTGGCCCCTGTCATCCTCATG 

3181 + + + + + + 3240 

TTGACGAAGAGGTTCCTGTAGATACAGCAACTACTCCAAGACCGGGGACAGTAGGAGTAC

a NCFSKDIYVVDEVLAPVILM -

CTGCTCAATTCCTTCTTCAACGCCATCTCCACTCTTGTGGTCATCATGGCCAGCACGCCG

3241 + + + + + + 3300

GACGAGTTAAGGAAGAAGTTGCGGTAGAGGTGAGAACACCAGTAGTACCGGTCGTGCGGC

a LLNSFFNAISTLVVIMASTP - 

CTCTTCACTGTGGTCATCCTGCCCCTGGCTGTGCTCTACACCTTAGTGCAGCGCTTCTAT
3301 + + + + + + 3360 

GAGAAGTGACACCAGTAGGACGGGGACCGACACGAGATGTGGAATCACGTCGCGAAGATA 
a LFTVVILPLAVLYTLVQRFY - 

GCAGCCACATCACGGCAACTGAAGCGGCTGGAATCAGTCAGCCGCTCACCTATCTACTCC 
3361 + + + + + + 3420

CGTCGGTGTAGTGCCGTTGACTTCGCCGACCTTAGTCAGTCGGCGAGTGGATAGATGAGG
a AATSRQLKRLESVSRSPIYS - 



CACTTTTCGGAGACAGTGACTGGTGCCAGTGTCATCCGGGCCTACAACCGCAGCCGGGAT 
3421 + + + + + + 3480

GTGAAAAGCCTCTGTCACTGACCACGGTCACAGTAGGCCCGGATGTTGGCGTCGGCCCTA
a HFSETVTGASVIRAYNRSRD -



TTTGAGATCATCAGTGATACTAAGGTGGATGCCAACCAGAGAAGCTGCTACCCCTACATC
3481 + + + + + + 3540

AAACTCTAGTAGTCACTATGATTCCACCTACGGTTGGTCTCTTCGACGATGGGGATGTAG
a FEIISDTKVDANQRSCYPYI -

ATCTCCAACCGGTGGCTGAGCATCGGAGTGGAGTTCGTGGGGAACTGCGTGGTGCTCTTT
3541 + + + + + + 3600

TAGAGGTTGGCCACCGACTCGTAGCCTCACCTCAAGCACCCCTTGACGCACCACGAGAAA
a ISNRWLSIGVEFVGNCVVLF -

GCTGCACTATTTGCCGTCATCGGGAGGAGCAGCCTGAACCCGGGGCTGGTGGGCCTTTCT 

3601 + + + + + + 3660

CGACGTGATAAACGGCAGTAGCCCTCCTCGTCGGACTTGGGCCCCGACCACCCGGAAAGA 
a AALFAVIGRSSLNPGLVGLS - 

GTGTCCTACTCCTTGCAGGTGACATTTGCTCTGAACTGGATGATACGAATGATGTCAGAT 
3661 + + + + + + 3720 

CACAGGATGAGGAACGTCCACTGTAAACGAGACTTGACCTACTATGCTTACTACAGTCTA 
a VSYSLQVTFALNWMIRMMSD -

TTGGAATCTAACATCGTGGCTGTGGAGAGGGTCAAGGAGTACTCCAAGACAGAGACAGAG 
3721 + + + + + + 3780 

AACCTTAGATTGTAGCACCGACACCTCTCCCAGTTCCTCATGAGGTTCTGTCTCTGTCTC 
a LESNIVAVERVKEYSKTETE - 

GCGCCCTGGGTGGTGGAAGGCAGCCGCCCTCCCGAAGGTTGGCCCCCACGTGGGGAGGTG 
3781 + + + + + + 3840 

CGCGGGACCCACCACCTTCCGTCGGCGGGAGGGCTTCCAACCGGGGGTGCACCCCTCCAC 
a APWVVEGSRPPEGWPPRGEV -

GAGTTCCGGAATTATTCTGTGCGCTACCGGCCGGGCCTAGACCTGGTGCTGAGAGACCTG
3841 + + + + + + 3900

CTCAAGGCCTTAATAAGACACGCGATGGCCGGCCCGGATCTGGACCACGACTCTCTGGAC
a EFRNYSVRYRPGLDLVLRDL - 

AGTCTGCATGTGCACGGTGGCGAGAAGGTGGGGATCGTGGGCCGCACTGGGGCTGGCAAG
3901 + + + + + + 3960

TCAGACGTACACGTGCCACCGCTCTTCCACCCCTAGCACCCGGCGTGACCCCGACCGTTC 
a SLHVHGGEKVGIVGRTGAGK - 

TCTTCCATGACCCTTTGCCTGTTCCGCATCCTGGAGGCGGCAAAGGGTGAAATCCGCATT 
3961 + + + + + + 4020 

AGAAGGTACTGGGAAACGGACAAGGCGTAGGACCTCCGCCGTTTCCCACTTTAGGCGTAA
a SSMTLCLFRILEAAKGEIRI - 

GATGGCCTCAATGTGGCAGACATCGGCCTCCATGACCTGCGCTCTCAGCTGACCATCATC
4021 + + + + + + 4080

CTACCGGAGTTACACCGTCTGTAGCCGGAGGTACTGGACGCGAGAGTCGACTGGTAGTAG
a DGLNVADIGLHDLRSQLTII -

CCGCAGGACCCCATCCTGTTCTCGGGGACCCTGCGCATGAACCTGGACCCCTTCGGCAGC
4081 + + + + + + 4140

GGCGTCCTGGGGTAGGACAAGAGCCCCTGGGACGCGTACTTGGACCTGGGGAAGCCGTCG
a PQDPILFSGTLRMNLDPFGS -

TACTCAGAGGAGGACATTTGGTGGGCTTTGGAGCTGTCCCACCTGCACACGTTTGTGAGC 
4141 + + + + + + 4200 

ATGAGTCTCCTCCTGTAAACCACCCGAAACCTCGACAGGGTGGACGTGTGCAAACACTCG
a YSEEDIWWALELSHLHTFVS - 

TCCCAGCCGGCAGGCCTGGACTTCCAGTGCTCAGAGGGCGGGGAGAATCTCAGCGTGGGC 
4201 + + + + + + 4260

AGGGTCGGCCGTCCGGACCTGAAGGTCACGAGTCTCCCGCCCCTCTTAGAGTCGCACCCG
a SQPAGLDFQCSEGGENLSVG -

CAGAGGCAGCTCGTGTGCCTGGCCCGAGCCCTGCTCCGCAAGAGCCGCATCCTGGTTTTA 

4261 + + + + + + 4320 

GTCTCCGTCGAGCACACGGACCGGGCTCGGGACGAGGCGTTCTCGGCGTAGGACCAAAAT 

a QRQLVCLARALLRKSRILVL - 

GACGAGGCCACAGCTGCCATCGACCTGGAGACTGACAACCTCATCCAGGCTACCATCCGC

4321 + + + + + + 4380 

CTGCTCCGGTGTCGACGGTAGCTGGACCTCTGACTGTTGGAGTAGGTCCGATGGTAGGCG 

a DEATAAIDLETDNLIQATIR - 

ACCCAGTTTGATACCTGCACTGTCCTGACCATCGCACACCGGCTTAACACTATCATGGAC
4381 + + + + + + 4440

TGGGTCAAACTATGGACGTGACAGGACTGGTAGCGTGTGGCCGAATTGTGATAGTACCTG
a TQFDTCTVLTIAHRLNTIMD -

TACACCAGGGTCCTGGTCCTGGACAAAGGAGTAGTAGCTGAATTTGATTCTCCAGCCAAC
4441 + + + + + + 4500

ATGTGGTCCCAGGACCAGGACCTGTTTCCTCATCATCGACTTAAACTAAGAGGTCGGTTG
a YTRVLVLDKGVVAEFDSPAN -

CTCATTGCAGCTAGAGGCATCTTCTACGGGATGGCCAGAGATGCTGGACTTGCCTAA
4501 + + + + + 4557

GAGTAACGTCGATCTCCGTAGAAGATGCCCTACCGGTCTCTACGACCTGAACGGATT
a LIAARGIFYGMARDAGLA* -

MOAT E cDNA AND AMINO ACID SEQUENCE ENCODED THEREBY

ATGGCCGCGCCTGCTGAGCCCTGCGCGGGGCAGGGGGTCTGGAACCAGACAGAGCCTGAA 
1 + + + + + + 60 

TACCGGCGCGGACGACTCGGGACGCGCCCCGTCCCCCAGACCTTGGTCTGTCTCGGACTT 
a MAAPAEPCAGQGVWNQTEPE - 

CCTGCCGCCACCAGCCTGCTGAGCCTGTGCTTCCTGAGAACAGCAGGGGTCTGGGTACCC 

61 + + + + + + 120

GGACGGCGGTGGTCGGACGACTCGGACACGAAGGACTCTTGTCGTCCCCAGACCCATGGG
a PAATSLLSLCFLRTAGVWVP -

CCCATGTACCTCTGGGTCCTTGGTCCCATCTACCTCCTCTTCATCCACCACCATGGCCGG 
121 + + + + + + 180 

GGGTACATGGAGACCCAGGAACCAGGGTAGATGGAGGAGAAGTAGGTGGTGGTACCGGCC 
a PMYLWVLGPIYLLFIHHHGR -

GGCTACCTCCGGATGTCCCCACTCTTCAAAGCCAAGATGGTGCTTGGATTCGCCCTCATA 
181 + + + + + + 240 

CCGATGGAGGCCTACAGGGGTGAGAAGTTTCGGTTCTACCACGAACCTAAGCGGGAGTAT 
a GYLRMSPLFKAKMVLGFALI - 

GTCCTGTGTACCTCCAGCGTGGCTGTCGCTCTTTGGAAAATCCAACAGGGAACGCCTGAG
241 + + + + + + 300

CAGGACACATGGAGGTCGCACCGACAGCGAGAAACCTTTTAGGTTGTCCCTTGCGGACTC
a VLCTSSVAVALWKIQQGTPE -

GCCCCAGAATTCCTCATTCATCCTACTGTGTGGCTCACCACGATGAGCTTCGCAGTGTTC
301 + + + + + + 360

CGGGGTCTTAAGGAGTAAGTAGGATGACACACCGAGTGGTGCTACTCGAAGCGTCACAAG
a APEFLIHPTVWLTTMSFAVF -

CTGATTCACACCGAGAGGAAAAAGGGAGTCCAGTCATCTGGAGTGCTGTTTGGTTACTGG 
361 + + + + + + 420

GACTAAGTGTGGCTCTCCTTTTTCCCTCAGGTCAGTAGACCTCACGACAAACCAATGACC
a LIHTERKKGVQSSGVLFGYW -

CTTCTCTGCTTTGTCTTGCCAGCTACCAACGCTGCCCAGCAGGCCTCCGGAGCGGGCTTC 

421 + + + + + + 480

GAAGAGACGAAACAGAACGGTCGATGGTTGCGACGGGTCGTCCGGAGGCCTCGCCCGAAG
a LLCFVLPATNAAQQASGAGF -

CAGAGCGACCCTGTCCGCCACCTGTCCACCTACCTATGCCTGTCTCTGGTGGTGGCACAG
481 + + + + + + 540

GTCTCGCTGGGACAGGCGGTGGACAGGTGGATGGATACGGACAGAGACCACCACCGTGTC
a QSDPVRHLSTYLCLSLVVAQ -

TTTGTGCTGTCCTGCCTGGCGGATCAACCCCCCTTCTTCCCTGAAGACCCCCAGCAGTCT
541 + + + + + + 600

AAACACGACAGGACGGACCGCCTAGTTGGGGGGAAGAAGGGACTTCTGGGGGTCGTCAGA
a FVLSCLADQPPFFPEDPQQS -

AACCCCTGTCCAGAGACTGGGGCAGCCTTCCCCTCCAAAGCCACGTTCTGGTGGGTTTCT
601 + + + + + + 660

TTGGGGACAGGTCTCTGACCCCGTCGGAAGGGGAGGTTTCGGTGCAAGACCACCCAAAGA
a NPCPETGAAFPSKATFWWVS -

GGCCTGGTCTGGAGGGGATACAGGAGGCCACTGAGACCAAAAGACCTCTGGTCGCTTGGG
661 + + + + + + 720

CCGGACCAGACCTCCCCTATGTCCTCCGGTGACTCTGGTTTTCTGGAGACCAGCGAACCC
a GLVWRGYRRPLRPKDLWSLG -

AGAGAAAACTCCTCAGAAGAACTTGTTTCCCGGCTTGAAAAGGAGTGGATGAGGAACCGC
721 + + + + + + 780

TCTCTTTTGAGGAGTCTTCTTGAACAAAGGGCCGAACTTTTCCTCACCTACTCCTTGGCG
a RENSSEELVSRLEKEWMRNR -



AGTGCAGCCCGGAGGCACAACAAGGCAATAGCATTTAAAAGGAAAGGCGGCAGTGGCATG
781 + + + + + + 840

TCACGTCGGGCCTCCGTGTTGTTCCGTTATCGTAAATTTTCCTTTCCGCCGTCACCGTAC
a SAARRHNKAIAFKRKGGSGM -



AAGGCTCCAGAGACCGAGCCCTTCCTACGGCAAGAAGGGAGCCAGTGGCGCCCACTGCTG 
841 + + + + + + 900

TTCCGAGGTCTCTGGCTCGGGAAGGATGCCGTTCTTCCCTCGGTCACCGCGGGTGACGAC 

a KAPETEPFLRQEGSQWRPLL - 



AAGGCCATCTGGCAGGTGTTCCATTCTACCTTCCTCCTGGGGACCCTCAGCCTCATCATC
901 + + + + + + 960

TTCCGGTAGACCGTCCACAAGGTAAGATGGAAGGAGGACCCCTGGGAGTCGGAGTAGTAG
a KAIWQVFHSTFLLGTLSLII -

AGTGATGTCTTCAGGTTCACTGTCCCCAAGCTGCTCAGCCTTTTCCTGGAGTTTATTGGT
961 + + + + + + 1020

TCACTACAGAAGTCCAAGTGACAGGGGTTCGACGAGTCGGAAAAGGACCTCAAATAACCA
a SDVFRFTVPKLLSLFLEFIG -



gatcccaagcctccagcctggaagggctacctcctcgccgtgctgatgttcctctcagcc 

1021 + + + + + + 1080 

ctagggttcggaggtcggaccttcccgatggaggagcggcacgactacaaggagagtcgg 

a DPKPPAWKGYLLAVLMFLSA - 

tgcctgcaaacgctgtttgagcagcagaacatgtacaggctcaaggtgccgcagatgagg 

1081 + + + + + + 1140

acggacgtttgcgacaaactcgtcgtcttgtacatgtccgagttccacggcgtctactcc 

a CLQTLFEQQNMYRLKVPQMR - 

ttgcggtcggccatcactggcctggtgtacagaaaggtcctggctctgtccagcggctcc 

1141 + + + + + + 1200 

aacgccagccggtagtgaccggaccacatgtctttccaggaccgagacaggtcgccgagg 

a LRSAITGLVYRKVLALSSGS -

agaaaggccagtgcggtgggtgatgtggtcaatctggtgtccgtggacgtgcagcggctg 

1201 + + + + + + 1260

TCTTTCCGGTCACGCCACCCACTACACCAGTTAGACCACAGGCACCTGCACGTCGCCGAC
a RKASAVGDVVNLVSVDVQRL -

ACCGAGAGCGTCCTCTACCTCAACGGGCTGTGGCTGCCTCTCGTCTGGATCGTGGTCTGC
1261 + + + + + + 1320

TGGCTCTCGCAGGAGATGGAGTTGCCCGACACCGACGGAGAGCAGACCTAGCACCAGACG
a TESVLYLNGLWLPLVWIVVC- 

TTCGTCTATCTCTGGCAGCTCCTGGGGCCCTCCGCCCTCACTGCCATCGCTGTCTTCCTG 
1321 + + + + + + 1380

AAGCAGATAGAGACCGTCGAGGACCCCGGGAGGCGGGAGTGACGGTAGCGACAGAAGGAC
a FVYLWQLLGPSALTAIAVFL -

AGCCTCCTCCCTCTGAATTTCTTCATCTCCAAGAAAAGGAACCACCATCAGGAGGAGCAA
1381 + + + + + + 1440 

TCGGAGGAGGGAGACTTAAAGAAGTAGAGGTTCTTTTCCTTGGTGGTAGTCCTCCTCGTT 
a SLLPLNFFISKKRNHHQEEQ- 

ATGAGGCAGAAGGACTCACGGGCACGGCTCACCAGCTCTATCCTCAGGAACTCGAAGACC 
1441 + + + + + + 1500

TACTCCGTCTTCCTGAGTGCCCGTGCCGAGTGGTCGAGATAGGAGTCCTTGAGCTTCTGG
a MRQKDSRARLTSSILRNSKT- 

ATCAAGTTCCATGGCTGGGAGGGAGCCTTTCTGGACAGAGTCCTGGGCATCCGAGGCCAG
1501 + + + + + + 1560

TAGTTCAAGGTACCGACCCTCCCTCGGAAAGACCTGTCTCAGGACCCGTAGGCTCCGGTC 

a IKFHGWEGAFLDRVLGIRGQ - 

GAGCTGGGCGCCTTGCGGACCTCCGGCCTCCTCTTCTCTGTGTCGCTGGTGTCCTTCCAA
1561 + + + + + + 1620

CTCGACCCGCGGAACGCCTGGAGGCCGGAGGAGAAGAGACACAGCGACCACAGGAAGGTT 
a ELGALRTSGLLFSVSLVSFQ - 

GTGTCTACATTTCTGGTCGCACTGGTGGTGTTTGCTGTCCACACTCTGGTGGCCGAGAAT 
1621 + + + + + + 1680

CACAGATGTAAAGACCAGCGTGACCACCACAAACGACAGGTGTGAGACCACCGGCTCTTA
a VSTFLVALVVFAVHTLVAEN -

GCTATGAATGCAGAGAAAGCCTTTGTGACTCTCACAGTTCTCAACATCCTCAACAAGGCC
1681 + + + + + + 1740

CGATACTTACGTCTCTTTCGGAAACACTGAGAGTGTCAAGAGTTGTAGGAGTTGTTCCGG
a AMNAEKAFVTLTVLNILNKA -

CAGGCTTTCCTGCCCTTCTCCATCCACTCCCTCGTCCAGGCCCGGGTGTCCTTTGACCGT 
1741 + + + + + + 1800

GTCCGAAAGGACGGGAAGAGGTAGGTGAGGGAGCAGGTCCGGGCCCACAGGAAACTGGCA 
a QAFLPFSIHSLVQARVSFDR - 

CTGGTCACCTTCCTCTGCCTGGAAGAAGTTGACCCTGGTGTCGTAGACTCAAGTTCCTCT

1801 + + + + + + 1860

GACCAGTGGAAGGAGACGGACCTTCTTCAACTGGGACCACAGCATCTGAGTTCAAGGAGA

a LVTFLCLEEVDPGVVDSSSS - 

GGAAGCGCTGCCGGGAAGGATTGCATCACCATACACAGTGCCACCTTCGCCTGGTCCCAG
1861 + + + + + + 1920 

CCTTCGCGACGGCCCTTCCTAACGTAGTGGTATGTGTCACGGTGGAAGCGGACCAGGGTC 

a GSAAGKDCITIHSATFAWSQ - 

GAAAGCCCTCCCTGCCTCCACAGAATAAACCTCACGGTGCCCCAGGGCTGTCTGCTGGCT

1921 + + + + + + 1980

CTTTCGGGAGGGACGGAGGTGTCTTATTTGGAGTGCCACGGGGTCCCGACAGACGACCGA

a ESPPCLHRINLTVPQGCLLA -

GTTGTCGGTCCAGTGGGGGCAGGGAAGTCCTCCCTGCTGTCCGCCCTCCTTGGGGAGCTG 
1981 + + + + + + 2040

CAACAGCCAGGTCACCCCCGTCCCTTCAGGAGGGACGACAGGCGGGAGGAACCCCTCGAC 
a VVGPVGAGKSSLLSALLGEL - 

TCAAAGGTGGAGGGGTTCGTGAGCATCGAGGGTGCTGTGGCCTACGTGCCCCAGGAGGCC 
2041 + + + + + + 2100

AGTTTCCACCTCCCCAAGCACTCGTAGCTCCCACGACACCGGATGCACGGGGTCCTCCGG
a SKVEGFVSIEGAVAYVPQEA -

TGGGTGCAGAACACCTCTGTGGTAGAGAATGTGTGCTTCGGGCAGGAGCTGGACCCACCC
2101 + + + + + + 2160

ACCCACGTCTTGTGGAGACACCATCTCTTACACACGAAGCCCGTCCTCGACCTGGGTGGG
a WVQNTSVVENVCFGQELDPP -

TGGCTGGAGAGAGTACTAGAAGCCTGTGCCCTGCAGCCAGATGTGGACAGCTTCCCTGAG 

2161 + + + + + + 2220

ACCGACCTCTCTCATGATCTTCGGACACGGGACGTCGGTCTACACCTGTCGAAGGGACTC
a WLERVLEACALQPDVDSFPE -

GGAATCCACACTTCAATTGGGGAGCAGGGCATGAATCTCTCCGGAGGCCAGAAGCAGCGG 
2221 + + + + + + 2280 

CCTTAGGTGTGAAGTTAACCCCTCGTCCCGTACTTAGAGAGGCCTCCGGTCTTCGTCGCC 
a GIHTSIGEQGMNLSGGQKQR - 



CTGAGCCTGGCCCGGGCTGTATACAGAAAGGCAGCTGTGTACCTGCTGGATGACCCCCTG
2281 + + + + + + 2340

GACTCGGACCGGGCCCGACATATGTCTTTCCGTCGACACATGGACGACCTACTGGGGGAC
a LSLARAVYRKAAVYLLDDPL -

GCGGCCCTGGATGCCCACGTTGGCCAGCATGTCTTCAACCAGGTCATTGGGCCTGGTGGG 
2341 + + + + + + 2400

CGCCGGGACCTACGGGTGCAACCGGTCGTACAGAAGTTGGTCCAGTAACCCGGACCACCC 

a AALDAHVGQHVFNQVIGPGG - 

CTACTCCAGGGNACAACACGGATTCTCGTGACGCACGCACTCCACATCCTGCCCCAGGCT
2401 + + + + + + 2460

GATGAGGTCCCNTGTTGTGCCTAAGAGCACTGCGTGCGTGAGGTGTAGGACGGGGTCCGA

a LLQGTTRILVTHALHILPQA -

GATTGGATCATAGTGCTGGCAAATGGGGCCATCGCAGAGATGGGTTCCTACCAGGAGCTT 
2461 + + + + + + 2520

CTAACCTAGTATCACGACCGTTTACCCCGGTAGCGTCTCTACCCAAGGATGGTCCTCGAA

a DWIIVLANGAIAEMGSYQEL -



CTGCAGAGGAAGGGGGCCCTCGTGTGTCTTCTGGATCAAGCCAGACAGCCAGGAGATAGA 
2521 + + + + + + 2580

GACGTCTCCTTCCCCCGGGAGCACACAGAAGACCTAGTTCGGTCTGTCGGTCCTCTATCT
a LQRKGALVCLLDQARQPGDR -

GGAGAAGGAGAAACAGAACCTGGGACCAGCACCAAGGACCCCAGAGGCACCTCTGCAGGC 
2581 + + + + + + 2640

CCTCTTCCTCTTTGTCTTGGACCCTGGTCGTGGTTCCTGGGGTCTCCGTGGAGACGTCCG 
a GEGETEPGTSTKDPRGTSAG - 

AGGAGGCCCGAGCTTAGACGCGAGAGGTCCATCAAGTCAGTCCCTGAGAAGGACCGTACC 
2641 + + + + + + 2700

TCCTCCGGGCTCGAATCTGCGCTCTCCAGGTAGTTCAGTCAGGGACTCTTCCTGGCATGG
a RRPELRRERSIKSVPEKDRT - 

ACTTCAGAAGCCCAGACAGAGGTTCCTCTGGATGACCCTGACAGGGCAGGATGGCCAGCA 
2701 + + + + + + 2760

TGAAGTCTTCGGGTCTGTCTCCAAGGAGACCTACTGGGACTGTCCCGTCCTACCGGTCGT 
a TSEAQTEVPLDDPDRAGWPA - 

GGAAAGGACAGCATCCAATACGGCAGGGTGAAGGCCACAGTGCACCTGGCCTACCTGCGT 
2761 + + + + + + 2820 

CCTTTCCTGTCGTAGGTTATGCCGTCCCACTTCCGGTGTCACGTGGACCGGATGGACGCA 
a GKDSIQYGRVKATVHLAYLR - 

GCCGTGGGCACCCCCCTCTGCCTCTACGCACTCTTCCTCTTCCTCTGCCAGCAAGTGGCC
2821 + + + + + + 2880 

CGGCACCCGTGGGGGGAGACGGAGATGCGTGAGAAGGAGAAGGAGACGGTCGTTCACCGG 
a AVGTPLCLYALFLFLCQQVA - 

TCCTTCTGCCGGGGCTACTGGCTGAGCCTGTGGGCGGACGACCCTGCAGTAGGTGGGCAG 
2881 + + + + + + 2940 

AGGAAGACGGCCCCGATGACCGACTCGGACACCCGCCTGCTGGGACGTCATCCACCCGTC 
a SFCRGYWLSLWADDPAVGGQ 

CAGACGCAGGCAGCCCTGCGTGGCGGGATCTTCGGGCTCCTCGGCTGTCTCCAAGCCATT 
2941 + + + + + + 3000

GTCTGCGTCCGTCGGGACGCACCGCCCTAGAAGCCCGAGGAGCCGACAGAGGTTCGGTAA
a QTQAALRGGIFGLLGCLQAI -

GGGCTGTTTGCCTCCATGGCTGCGGTGCTCCTAGGTGGGGCCCGGGCATCCAGGTTGCTC
3001 + + + + + + 3060

CCCGACAAACGGAGGTACCGACGCCACGAGGATCCACCCCGGGCCCGTAGGTCCAACGAG
a GLFASMAAVLLGGARASRLL -

TTCCAGAGGCTCCTGTGGGATGTGGTGCGATCTCCCATCAGCTTCTTTGAGCGGACACCC
3061 + + + + + + 3120

AAGGTCTCCGAGGACACCCTACACCACGCTAGAGGGTAGTCGAAGAAACTCGCCTGTGGG
a FQRLLWDVVRSPISFFERTP -

ATTGGTCACCTGCTAAACCGCTTCTCCAAGGAGACAGACACGGTTGACGTGGACATTCCA 
3121 + + + + + + 3180 

TAACCAGTGGACGATTTGGCGAAGAGGTTCCTCTGTCTGTGCCAACTGCACCTGTAAGGT 
a IGHLLNRFSKETDTVDVDIP - 

GACAAACTCCGGTCCCTGCTGATGTACGCCTTTGGACTCCTGGAGGTCAGCCTGGTGGTG 
3181 + + + + + + 3240

CTGTTTGAGGCCAGGGACGACTACATGCGGAAACCTGAGGACCTCCAGTCGGACCACCAC 
a DKLRSLLMYAFGLLEVSLVV - 

GCAGTGGCTACCCCACTGGCCACTGTGGCCATCCTGCCACTGTTTCTCCTCTACGCTGGG
3241 + + + + + + 3300 

CGTCACCGATGGGGTGACCGGTGACACCGGTAGGACGGTGACAAAGAGGAGATGCGACCC 
a AVATPLATVAILPLFLLYAG - 

TTTCAGAGCCTGTATGTGGTTAGCTCATGCCAGCTGAGACGCTTGGAGTCAGCCAGCTAC 
3301 + + + + + + 3360 

AAAGTCTCGGACATACACCAATCGAGTACGGTCGACTCTGCGAACCTCAGTCGGTCGATG 
a FQSLYVVSSCQLRRLESASY - 

TCGTCTGTCTGCTCCCACATGGCTGAGACGTTCCAGGGCAGCACAGTGGTCCGGGCATTC 
3361 + + + + + + 3420

AGCAGACAGACGAGGGTGTACCGACTCTGCAAGGTCCCGTCGTGTCACCAGGCCCGTAAG



Figure 15H 




a SSVCSHMAETFQGSTVVRAF - 

CGAACCCAGGCCCCTCTTGTGGCTCAGAACAATGCTCGCGTAGATGAAAGCCAGAGGATC 
3421 + + + + + + 3480 

GCTTGGGTCCGGGGAGAACACCGAGTCTTGTTACGAGCGCATCTACTTTCGGTCTCCTAG 
a RTQAPLVAQNNARVDESQRI -

AGTTTCCCGCGACTGGTGGCTGACAGGTGGCTTGCGGCCAATGTGGAGCTCCTGGGGAAT 
3481 + + + + + + 3540 

TCAAAGGGCGCTGACCACCGACTGTCCACCGAACGCCGGTTACACCTCGAGGACCCCTTA 
a SFPRLVADRWLAANVELLGN - 

GGCCTGGTGTTTGCAGCTGCCACGTGTGCTGTGCTGAGCAAAGCCCACCTCAGTGCTGGC 
3541 + + + + + + 3600 

CCGGACCACAAACGTCGACGGTGCACACGACACGACTCGTTTCGGGTGGAGTCACGACCG 

a GLVFAAATCAVLSKAHLSAG - 

CTCGTGGGCTTCTCTGTCTCTGCTGCCCTCCAGGTGACCCAGGCACTGCAGTGGGTTGTT 

3601 + + + + + + 3660

GAGCACCCGAAGAGACAGAGACGACGGGAGGTCCACTGGGTCCGTGACGTCACCCAACAA 

a LVGFSVSAALQVTQALQWVV - 

CGCAACTGGACAGACCTAGAGAACAGCATCGTGTCAGTGGAGCGGATGCAGGACTATGCC
3661 + + + + + + 3720

GCGTTGACCTGTCTGGATCTCTTGTCGTAGCACAGTCACCTCGCCTACGTCCTGATACGG
a RNWTDLENSIVSVERMQDYA -

TGGACGCCCAAGGAGGCTCCCTGGAGGCTGCCCACATGTGCAGCTCAGCCCCCCTGGCCT
3721 + + + + + + 3780

ACCTGCGGGTTCCTCCGAGGGACCTCCGACGGGTGTACACGTCGAGTCGGGGGGACCGGA
a WTPKEAPWRLPTCAAQPPWP -

CAGGGCGGGCAGATCGAGTTCCGGGACTTTGGGCTAAGATACCGACCTGAGCTCCCGCTG
3781 + + + + + + 3840

GTCCCGCCCGTCTAGCTCAAGGCCCTGAAACCCGATTCTATGGCTGGACTCGAGGGCGAC

a QGGQIEFRDFGLRYRPELPL - 

GCTGTGCAGGGCGTGTCCCTCAAGATCCACGCAGGAGAGAAGGTGGGCATCGTTGGCAGG
3841 + + + + + + 3900

CGACACGTCCCGCACAGGGAGTTCTAGGTGCGTCCTCTCTTCCACCCGTAGCAACCGTCC 
a AVQGVSLKIHAGEKVGIVGR 

ACCGGGGCAGGGAAGTCCTCCCTGGCCAGTGGGCTGCTGCGGCTCCAGGAGGCAGCTGAG 

3901 + + + + + + 3960

TGGCCCCGTCCCTTCAGGAGGGACCGGTCACCCGACGACGCCGAGGTCCTCCGTCGACTC 
a TGAGKSSLASGLLRLQEAAE - 

GGTGGGATCTGGATCGACGGGGTCCCCATTGCCCACGTGGGGCTGCACACACTGCGCTCC 
3961 + + + + + + 4020 

CCACCCTAGACCTAGCTGCCCCAGGGGTAACGGGTGCACCCCGACGTGTGTGACGCGAGG 
a GGIWIDGVPIAHVGLHTLRS - 

AGGATCAGCATCATCCCCCAGGACCCCATCCTGTTCCCTGGCTCTCTGCGGATGAACCTC 
4021 + + + + + + 4080 

TCCTAGTCGTAGTAGGGGGTCCTGGGGTAGGACAAGGGACCGAGAGACGCCTACTTGGAG 
a RISIIPQDPILFPGSLRMNL- 

GACCTGCTGCAGGAGCACTCGGACGAGGCTATCTGGGCAGCCCTGGAGACGGTGCAGCTC 
4081 + + + + + + 4140 

CTGGACGACGTCCTCGTGAGCCTGCTCCGATAGACCCGTCGGGACCTCTGCCACGTCGAG 
a DLLQEHSDEAIWAALETVQL - 

AAAGCCTTGGTGGCCAGCCTGCCCGGCCAGCTGCAGTACAAGTGTGCTGACCGAGGCGAG 
4141 + + + + + + 4200

TTTCGGAACCACCGGTCGGACGGGCCGGTCGACGTCATGTTCACACGACTGGCTCCGCTC
a KALVASLPGQLQYKCADRGE -



GACCTGAGCGTGGGCCAGAAACAGCTCCTGTGTCTGGCACGTGCCCTTCTCCGGAAGACC
4201 + + + + + + 4260

CTGGACTCGCACCCGGTCTTTGTCGAGGACACAGACCGTGCACGGGAAGAGGCCTTCTGG
a DLSVGQKQLLCLARALLRKT -

CAGATCCTCATCCTGGACGAGGCTACTGCTGCCGTGGACCCTGGCACGGAGCTGCAGATG
4261 + + + + + + 4320 

GTCTAGGAGTAGGACCTGCTCCGATGACGACGGCACCTGGGACCGTGCCTCGACGTCTAC 
a QILILDEATAAVDPGTELQM - 

CAGGCCATGCTCGGGAGCTGGTTTGCACAGTGCACTGTGCTGCTCATTGCCCACCGCCTG 
4321 + + + + + + 4380 

GTCCGGTACGAGCCCTCGACCAAACGTGTCACGTGACACGACGAGTAACGGGTGGCGGAC 
a QAMLGSWFAQCTVLLIAHRL -

CGCTCCGTGATGGACTGTGCCCGGGTTCTGGTCATGGACAAGGGGCAGGTGGCAGAGAGC 
4381 + + + + + + 4440 

GCGAGGCACTACCTGACACGGGCCCAAGACCAGTACCTGTTCCCCGTCCACCGTCTCTCG 
a RSVMDCARVLVMDKGQVAES- 

GGCAGCCCGGCCCAGCTGCTGGCCCAGAAGGGCCTGTTTTACAGACTGGCCCAGGAGTCA 
4441 + + + + + + 4500

CCGTCGGGCCGGGTCGACGACCGGGTCTTCCCGGACAAAATGTCTGACCGGGTCCTCAGT

a GSPAQLLAQKGLFYRLAQES -

GGCCTGGTCTGA

4501 + 4512

CCGGACCAGACT

a GLV* -

SEQUENCE LISTING

<110> Fox Chase Cancer Center 
Kruh, Gary D. 
Lee, Kun
Belinsky, Martin G.
Bain, Lisa J. 

<120> MRP-Related ABC Transporter Encoding 
Nucleic Acids and Methods of Use Thereof 



<130> FCCC 98-02 

<150> 60/079,759 
<151> 1998-03-27 

<150> 60/095,153 
<151> 1998-08-03 

<160> 18 

<170> FastSEQ for Windows Version 3.0 

<210> 1 
<211> 4231 
<212> DNA 

<213> Homo sapiens 
<400> 1 

ggacaggcgt ggcggccgga gccccagcat ccctgcttga ggtccaggag cggagcccgc 60 

ggccaccgcc gcctgatcag cgcgaccccg gcccgcgccc gccccgcccg gcaagatgct 120 

gcccgtgtac caggaggtga agcccaaccc gctgcaggac gcgaacatct gctcacgcgt 180 

gttcttctgg tggctcaatc ccttgtttaa aattggccat aaacggagat tagaggaaga 240 

tgatatgtat tcagtgctgc cagaagaccg ctcacagcac cttggagagg agttgcaagg 300 

gttctgggat aaagaagttt taagagctga gaatgacgca cagaagcctt ctttaacaag 360

agcaatcata aagtgttact ggaaatctta tttagttttg ggaattttta cgttaattga 420 

ggaaagtgcc aaagtaatcc agcccatatt tttgggaaaa attattaatt attttgaaaa 480 

ttatgatccc atggattctg tggctttgaa cacagcgtac gcctatgcca cggtgctgac 540 

tttttgcacg ctcattttgg ctatactgca tcacttatat ttttatcacg ttcagtgtgc 600 

tgggatgagg ttacgagtag ccatgtgcca tatgatttat cggaaggcac ttcgtcttag 660 

taacatggcc atggggaaga caaccacagg ccagatagtc aatctgctgt ccaatgatgt 720 

gaacaagttt gatcaggtga cagtgttctt acacttcctg tgggcaggac cactgcaggc 780

gatcgcagtg actgccctac tctggatgga gataggaata tcgtgccttg ctgggatggc 840 

agttctaatc attctcctgc ccttgcaaag ctgttttggg aagttgttct catcactgag 900 

gagtaaaact gcaactttca cggatgccag gatcaggacc atgaatgaag ttataactgg 960

tataaggata ataaaaatgt acgcctggga aaagtcattt tcaaatctta ttaccaattt 1020 

gagaaagaag gagatttcca agattctgag aagttcctgc ctcaggggga tgaatttggc 1080 

ttcgtttttc agtgcaagca aaatcatcgt gtttgtgacc ttcaccacct acgtgctcct 1140 

cggcagtgtg atcacagcca gccgcgtgtt cgtggcagtg acgctgtatg gggctgtgcg 1200 

gctgacggtt accctcttct tcccctcagc cattgagagg gtgtcagagg caatcgtcag 1260

catccgaaga atccagacct ttttgctact tgatgagata tcacagcgca accgtcagct 1320 

gccgtcagat ggtaaaaaga tggtgcatgt gcaggatttt actgcttttt gggataaggc 1380 

atcagagacc ccaactctac aaggcctttc ctttactgtc agacctggcg aattgttagc 1440 

tgtggtcggc cccgtgggag cagggaagtc atcactgtta agtgccgtgc tcggggaatt 1500 

ggccccaagt cacgggctgg tcagcgtgca tggaagaatt gcctatgtgt ctcagcagcc 1560

ctgggtgttc tcgggaactc tgaggagtaa tattttattt gggaagaaat atgaaaagga 1620 

acgatatgaa aaagtcataa aggcttgtgc tctgaaaaag gatttacagc tgttggagga 1680 

tggtgatctg actgtgatag gagatcgggg aaccacgctg agtggagggc agaaagcacg 1740 

ggtaaacctt gcaagagcag tgtatcaaga tgctgacatc tatctcctgg acgatcctct 1800 

cagtgcagta gatgcggaag ttagcagaca cttgttcgaa ctgtgtattt gtcaaatttt 1860 

gcatgagaag atcacaattt tagtgactca tcagttgcag tacctcaaag ctgcaagtca 1920 

gattctgata ttgaaagatg gtaaaatggt gcagaagggg acttacactg agttcctaaa 1980 

atctggtata gattttggct cccttttaaa gaaggataat gaggaaagtg aacaacctcc 2040 

agttccagga actcccacac taaggaatcg taccttctca gagtcttcgg tttggtctca 2100 

acaatcttct agaccctcct tgaaagatgg tgctctggag agccaagata cagagaatgt 2160

cccagttaca ctatcagagg agaaccgttc tgaaggaaaa gttggttttc aggcctataa 2220 

gaattacttc agagctggtg ctcactggat tgtcttcatt ttccttattc tcctaaacac 2280 

tgcagctcag gttgcctatg tgcttcaaga ttggtggctt tcatactggg caaacaaaca 2340 

aagtatgcta aatgtcactg taaatggagg aggaaatgta accgagaagc tagatcttaa 2400 

ctggtactta ggaatttatt caggtttaac tgtagctacc gttctttttg gcatagcaag 2460 

atctctattg gtattctacg tccttgttaa ctcttcacaa actttgcaca acaaaatgtt 2520 

tgagtcaatt ctgaaagctc cggtattatt ctttgataga aatccaatag gaagaatttt 2580 

aaatcgtttc tccaaagaca ttggacactt ggatgatttg ctgccgctga cgtttttaga 2640 

tttcatccag acattgctac aagtggttgg tgtggtctct gtggctgtgg ccgtgattcc 2700 

ttggatcgca atacccttgg ttccccttgg aatcattttc atttttcttc ggcgatattt 2760 

tttggaaacg tcaagagatg tgaagcgcct ggaatctaca actcggagtc cagtgttttc 2820 

ccacttgtca tcttctctcc aggggctctg gaccatccgg gcatacaaag cagaagagag 2880 

gtgtcaggaa ctgtttgatg cacaccagga tttacattca gaggcttggt tcttgttttt 2940 

gacaacgtcc cgctggttcg ccgtccgtct ggatgccatc tgtgccatgt ttgtcatcat 3000 

cgttgccttt gggtccctga ttctggcaaa aactctggat gccgggcagg ttggtttggc 3060 

actgtcctat gccctcacgc tcatggggat gtttcagtgg tgtgttcgac aaagtgctga 3120 

agttgagaat atgatgatct cagtagaaag ggtcattgaa tacacagacc ttgaaaaaga 3180 

agcaccttgg gaatatcaga aacgcccacc accagcctgg ccccatgaag gagtgataat 3240 

ctttgacaat gtgaacttca tgtacagtcc aggtgggcct ctggtactga agcatctgac 3300 

agcactcatt aaatcacaag aaaaggttgg cattgtggga agaaccggag ctggaaaaag 3360

ttccctcatc tcagcccttt ttagattgtc agaacccgaa ggtaaaattt ggattgataa 3420 

gatcttgaca actgaaattg gacttcacga tttaaggaag aaaatgtcaa tcatacctca 3480 

ggaacctgtt ttgttcactg gaacaatgag gaaaaacctg gatcccttta aggagcacac 3540 

ggatgaggaa ctgtggaatg ccttacaaga ggtacaactt aaagaaacca ttgaagatct 3600

tcctggtaaa atggatactg aattagcaga atcaggatcc aattttagtg ttggacaaag 3660 

acaactggtg tgccttgcca gggcaattct caggaaaaat cagatattga ttattgatga 3720 

agcgacggca aatgtggatc caagaactga tgagttaata caaaaaaaaa tccgggagaa 3780 

atttgcccac tgcaccgtgc taaccattgc acacagattg aacaccatta ttgacagcga 3840 

caagataatg gttttagatt caggaagact gaaagaatat gatgagccgt atgttttgct 3900 

gcaaaataaa gagagcctat tttacaagat ggtgcaacaa ctgggcaagg cagaagccgc 3960

tgccctcact gaaacagcaa aacaggtata cttcaaaaga aattatccac atattggtca 4020 

cactgaccac atggttacaa acacttccaa tggacagccc tcgaccttaa ctattttcga 4080 

gacagcactg tgaatccaac caaaatgtca agtccgttcc gaaggcattt tccactagtt 4140 

tttggactat gtaaaccaca ttgtactttt ttttactttg gcaacaaata tttatacata 4200 

caagatgcta gttcatttga atatttctcc c 4231 

<210> 2 

<211> 1325 

<212> PRT 

<213> Homo sapiens 

<400> 2 

            180                 185                 190

Gln Ile Val Asn Leu Leu Ser Asn Asp Val Asn Lys Phe Asp Gln Val
        195                 200                 205

Thr Val Phe Leu His Phe Leu Trp Ala Gly Pro Leu Gln Ala Ile Ala
    210                 215                 220

Val Thr Ala Leu Leu Trp Met Glu Ile Gly Ile Ser Cys Leu Ala Gly
225                 230                 235                 240

Met Ala Val Leu Ile Ile Leu Leu Pro Leu Gln Ser Cys Phe Gly Lys
                245                 250                 255

Leu Phe Ser Ser Leu Arg Ser Lys Thr Ala Thr Phe Thr Asp Ala Arg
            260                 265                 270

Ile Arg Thr Met Asn Glu Val Ile Thr Gly Ile Arg Ile Ile Lys Met
        275                 280                 285

Tyr Ala Trp Glu Lys Ser Phe Ser Asn Leu Ile Thr Asn Leu Arg Lys
    290                 295                 300

Lys Glu Ile Ser Lys Ile Leu Arg Ser Ser Cys Leu Arg Gly Met Asn
305                 310                 315                 320

Leu Ala Ser Phe Phe Ser Ala Ser Lys Ile Ile Val Phe Val Thr Phe
                325                 330                 335

Thr Thr Tyr Val Leu Leu Gly Ser Val Ile Thr Ala Ser Arg Val Phe
            340                 345                 350

Val Ala Val Thr Leu Tyr Gly Ala Val Arg Leu Thr Val Thr Leu Phe
        355                 360                 365

Phe Pro Ser Ala Ile Glu Arg Val Ser Glu Ala Ile Val Ser Ile Arg
    370                 375                 380

Arg Ile Gln Thr Phe Leu Leu Leu Asp Glu Ile Ser Gln Arg Asn Arg
385                 390                 395                 400

Gln Leu Pro Ser Asp Gly Lys Lys Met Val His Val Gln Asp Phe Thr
                405                 410                 415

Ala Phe Trp Asp Lys Ala Ser Glu Thr Pro Thr Leu Gln Gly Leu Ser
            420                 425                 430

Phe Thr Val Arg Pro Gly Glu Leu Leu Ala Val Val Gly Pro Val Gly
        435                 440                 445

Ala Gly Lys Ser Ser Leu Leu Ser Ala Val Leu Gly Glu Leu Ala Pro
    450                 455                 460

Ser His Gly Leu Val Ser Val His Gly Arg Ile Ala Tyr Val Ser Gln
465                 470                 475                 480

Gln Pro Trp Val Phe Ser Gly Thr Leu Arg Ser Asn Ile Leu Phe Gly
                485                 490                 495

Lys Lys Tyr Glu Lys Glu Arg Tyr Glu Lys Val Ile Lys Ala Cys Ala
            500                 505                 510

Leu Lys Lys Asp Leu Gln Leu Leu Glu Asp Gly Asp Leu Thr Val Ile
        515                 520                 525

Gly Asp Arg Gly Thr Pro Leu Ser Gly Gly Gln Lys Ala Arg Val Asn
    530                 535                 540

Leu Ala Arg Ala Val Tyr Gln Asp Ala Asp Ile Tyr Leu Leu Asp Asp
545                 550                 555                 560

Pro Leu Ser Ala Val Asp Ala Glu Val Ser Arg His Leu Phe Glu Leu
                565                 570                 575

Cys Ile Cys Gln Ile Leu His Glu Lys Ile Thr Ile Leu Val Thr His
            580                 585                 590

Gln Leu Gln Tyr Leu Lys Ala Ala Ser Gln Ile Leu Ile Leu Lys Asp
        595                 600                 605

Gly Lys Met Val Gln Lys Gly Thr Tyr Thr Glu Phe Leu Lys Ser Gly
    610                 615                 620

Ile Asp Phe Gly Ser Leu Leu Lys Lys Asp Asn Glu Glu Ser Glu Gln
625                 630                 635                 640

Pro Pro Val Pro Gly Thr Pro Thr Leu Arg Asn Arg Thr Phe Ser Glu
                645                 650                 655

Ser Ser Val Trp Ser Gln Gln Ser Ser Arg Pro Ser Leu Lys Asp Gly
            660                 665                 670

Ala Leu Glu Ser Gln Asp Thr Glu Asn Val Pro Val Thr Leu Ser Glu
        675                 680                 685

Glu Asn Arg Ser Glu Gly Lys Val Gly Phe Gln Ala Tyr Lys Asn Tyr
    690                 695                 700

Phe Arg Ala Gly Ala His Trp Ile Val Phe Ile Phe Leu Ile Leu Leu
705                 710                 715                 720

Asn Thr Ala Ala Gln Val Ala Tyr Val Leu Gln Asp Trp Trp Leu Ser
                725                 730                 735

Tyr Trp Ala Asn Lys Gln Ser Met Leu Asn Val Thr Val Asn Gly Gly
            740                 745                 750

Gly Asn Val Thr Glu Lys Leu Asp Leu Asn Trp Tyr Leu Gly Ile Tyr
        755                 760                 765

Ser Gly Leu Thr Val Ala Thr Val Leu Phe Gly Ile Ala Arg Ser Leu
    770                 775                 780

Leu Val Phe Tyr Val Leu Val Asn Ser Ser Gln Thr Leu His Asn Lys
785                 790                 795                 800

Met Phe Glu Ser Ile Leu Lys Ala Pro Val Leu Phe Phe Asp Arg Asn
                805                 810                 815

Pro Ile Gly Arg Ile Leu Asn Arg Phe Ser Lys Asp Ile Gly His Leu
            820                 825                 830

Asp Asp Leu Leu Pro Leu Thr Phe Leu Asp Phe Ile Gln Thr Leu Leu
        835                 840                 845

Gln Val Val Gly Val Val Ser Val Ala Val Ala Val Ile Pro Trp Ile
    850                 855                 860

Ala Ile Pro Leu Val Pro Leu Gly Ile Ile Phe Ile Phe Leu Arg Arg
865                 870                 875                 880

Tyr Phe Leu Glu Thr Ser Arg Asp Val Lys Arg Leu Glu Ser Thr Thr
                885                 890                 895

Arg Ser Pro Val Phe Ser His Leu Ser Ser Ser Leu Gln Gly Leu Trp
            900                 905                 910

Thr Ile Arg Ala Tyr Lys Ala Glu Glu Arg Cys Gln Glu Leu Phe Asp
        915                 920                 925

Ala His Gln Asp Leu His Ser Glu Ala Trp Phe Leu Phe Leu Thr Thr
    930                 935                 940

Ser Arg Trp Phe Ala Val Arg Leu Asp Ala Ile Cys Ala Met Phe Val
945                 950                 955                 960

Ile Ile Val Ala Phe Gly Ser Leu Ile Leu Ala Lys Thr Leu Asp Ala
                965                 970                 975

Gly Gln Val Gly Leu Ala Leu Ser Tyr Ala Leu Thr Leu Met Gly Met
            980                 985                 990

Phe Gln Trp Cys Val Arg Gln Ser Ala Glu Val Glu Asn Met Met Ile
        995                 1000                1005

Ser Val Glu Arg Val Ile Glu Tyr Thr Asp Leu Glu Lys Glu Ala Pro
    1010                1015                1020

Trp Glu Tyr Gln Lys Arg Pro Pro Pro Ala Trp Pro His Glu Gly Val
1025                1030                1035                1040

Ile Ile Phe Asp Asn Val Asn Phe Met Tyr Ser Pro Gly Gly Pro Leu
                1045                1050                1055

Val Leu Lys His Leu Thr Ala Leu Ile Lys Ser Gln Glu Lys Val Gly
            1060                1065                1070

Ile Val Gly Arg Thr Gly Ala Gly Lys Ser Ser Leu Ile Ser Ala Leu
        1075                1080                1085

Phe Arg Leu Ser Glu Pro Glu Gly Lys Ile Trp Ile Asp Lys Ile Leu
    1090                1095                1100

Thr Thr Glu Ile Gly Leu His Asp Leu Arg Lys Lys Met Ser Ile Ile
1105                1110                1115                1120

Pro Gln Glu Pro Val Leu Phe Thr Gly Thr Met Arg Lys Asn Leu Asp
                1125                1130                1135

Pro Phe Lys Glu His Thr Asp Glu Glu Leu Trp Asn Ala Leu Arg Glu
            1140                1145                1150

Val Gln Leu Lys Glu Thr Ile Glu Asp Leu Pro Gly Lys Met Asp Thr
        1155                1160                1165

Glu Leu Ala Glu Ser Gly Ser Asn Phe Ser Val Gly Gln Arg Gln Leu
    1170                1175                1180

Val Cys Leu Ala Arg Ala Ile Leu Arg Lys Asn Gln Ile Leu Ile Ile
1185                1190                1195                1200

Asp Glu Ala Thr Ala Asn Val Asp Pro Arg Thr Asp Glu Leu Ile Gln
                1205                1210                1215

Lys Lys Ile Arg Glu Lys Phe Ala His Cys Thr Val Leu Thr Ile Ala
            1220                1225                1230

His Arg Leu Asn Thr Ile Ile Asp Ser Asp Lys Ile Met Val Leu Asp
        1235                1240                1245

Ser Gly Arg Leu Lys Glu Tyr Asp Glu Pro Tyr Val Leu Leu Gln Asn
    1250                1255                1260

Lys Glu Ser Leu Phe Tyr Lys Met Val Gln Gln Leu Gly Lys Ala Glu
1265                1270                1275                1280

Ala Ala Ala Leu Thr Glu Thr Ala Lys Gln Val Tyr Phe Lys Arg Asn
                1285                1290                1295

Tyr Pro His Ile Gly His Thr Asp His Met Val Thr Asn Thr Ser Asn



<210> 3 

<211> 5838 

<212> DNA 

<213> Homo sapiens 

<400> 3 

ccgggcaggt ggctcatgct cgggagcgtg gttgagcggc tggcgcggtt gtcctggagc 60 

aggggcgcag gaattctgat gtgaaactaa cagtctgtga gccctggaac ctccgctcag 120 

agaagatgaa ggatatcgac ataggaaaag agtatatcat ccccagtcct gggtatagaa 180 

gtgtgaggga gagaaccagc acttctggga cgcacagaga ccgtgaagat tccaagttca 240

ggagaactcg accgttggaa tgccaagatg ccttggaaac agcagcccga gccgagggcc 300 

tctctcttga tgcctccatg cattctcagc tcagaatcct ggatgaggag catcccaagg 360 

gaaagtacca tcatggcttg agtgctctga agcccatccg gactacttcc aaacaccagc 420 

acccagtgga caatgctggg cttttttcct gtatgacttt ttcgtggctt tcttctctgg 480 

cccgtgtggc ccacaagaag ggggagctct caatggaaga cgtgtggtct ctgtccaagc 540 

acgagtcttc tgacgtgaac tgcagaagac tagagagact gtggcaagaa gagctgaatg 600 

aagttgggcc agacgctgct tccctgcgaa gggttgtgtg gatcttctgc cgcaccaggc 660 

tcatcctgtc catcgtgtgc ctgatgatca cgcagctggc tggcttcagt ggaccagcct 720 

tcatggtgaa acacctcttg gagtataccc aggcaacaga gtctaacctg cagtacagct 780 

tgttgttagt gctgggcctc ctcctgacgg aaatcgtgcg gtcttggtcg cttgcactga 840 

cttgggcatt gaattaccga accggtgtcc gcttgcgggg ggccatccta accatggcat 900 

ttaagaagat ccttaagtta aagaacatta aagagaaatc cctgggtgag ctcatcaaca 960 

tttgctccaa cgatgggcag agaatgtttg aggcagcagc cgttggcagc ctgctggctg 1020 

gaggacccgt tgttgccatc ttaggcatga tttataatgt aattattctg ggaccaacag 1080 

gcttcctggg atcagctgtt tttatcctct tttacccagc aatgatgttt gcatcacggc 1140 

tcacagcata tttcaggaga aaatgcgtgg ccgccacgga tgaacgtgtc cagaagatga 1200 

atgaagttct tacttacatt aaatttatca aaatgtatgc ctgggtcaaa gcattttctc 1260 

agagtgttca aaaaatccgc gaggaggagc gtcggatatt ggaaaaagcc gggtacttcc 1320

agggtatcac tgtgggtgtg gctcccattg tggtggtgat tgccagcgtg gtgaccttct 1380 

ctgttcatat gaccctgggc ttcgatctga cagcagcaca ggctttcaca gtggtgacag 1440 

tcttcaattc catgactttt gctttgaaag taacaccgtt ttcagtaaag tccctctcag 1500 

aagcctcagt ggctgttgac agatttaaga gtttgtttct aatggaagag gttcacatga 1560 

taaagaacaa accagccagt cctcacatca agatagagat gaaaaatgcc accttggcat 1620 

gggactcctc ccactccagt atccagaact cgcccaagct gacccccaaa atgaaaaaag 1680 

acaagagggc ttccaggggc aagaaagaga aggtgaggca gctgcagcgc actgagcatc 1740 

aggcggtgct ggcagagcag aaaggccacc tcctcctgga cagtgacgag cggcccagtc 1800 

ccgaagagga agaaggcaag cacatccacc tgggccacct gcgcttacag aggacactgc 1860 

acagcatcga tctggagatc caagagggta aactggttgg aatctgcggc agtgtgggaa 1920

gtggaaaaac ctctctcatt tcagccattt taggccagat gacgcttcta gagggcagca 1980 

ttgcaatcag tggaaccttc gcttatgtgg cccagcaggc ctggatcctc aatgctactc 2040 

tgagagacaa catcctgttt gggaaggaat atgatgaaga aagatacaac tctgtgctga 2100 

acagctgctg cctgaggcct gacctggcca ttcttcccag cagcgacctg acggagattg 2160 

gagagcgagg agccaacctg agcggtgggc agcgccagag gatcagcctt gcccgggcct 2220 

tgtatagtga caggagcatc tacatcctgg acgaccccct cagtgcctta gatgcccatg 2280 

tgggcaacca catcttcaat agtgctatcc ggaaacatct caagtccaag acagttctgt 2340 

ttgttaccca ccagttacag tacctggttg actgtgatga agtgatcttc atgaaagagg 2400 

gctgtattac ggaaagaggc acccatgagg aactgatgaa tttaaatggt gactatgcta 2460

ccatttttaa taacctgttg ctgggagaga caccgccagt tgagatcaat tcaaaaaagg 2520 

aaaccagtgg ttcacagaag aagtcacaag acaagggtcc taaaacagga tcagtaaaga 2580 

aggaaaaagc agtaaagcca gaggaagggc agcttgtgca gctggaagag aaagggcagg 2640

gttcagtgcc ctggtcagta tatggtgtct acatccaggc tgctgggggc cccttggcat 2700 

tcctggttat tatggccctt ttcatgctga atgtaggcag caccgccttc agcacctggt 2760 

ggttgagtta ctggatcaag caaggaagcg ggaacaccac tgtgactcga gggaacgaga 2820 

cctcggtgag tgacagcatg aaggacaatc ctcatatgca gtactatgcc agcatctacg 2880

ccctctccat ggcagtcatg ctgatcctga aagccattcg aggagttgtc tttgtcaagg 2940

gcacgctgcg agcttcctcc cggctgcatg acgagctttt ccgaaggatc cttcgaagcc 3000 

ctatgaagtt ttttgacacg acccccacag ggaggattct caacaggttt tccaaagaca 3060 

tggatgaagt tgacgtgcgg ctgccgttcc aggccgagat gttcatccag aacgttatcc 3120 

tggtgttctt ctgtgtggga atgatcgcag gagtcttccc gtggttcctt gtggcagtgg 3180 

ggccccttgt catcctcttt tcagtcctgc acattgtctc cagggtcctg attcgggagc 3240 

tgaagcgtct ggacaatatc acgcagtcac ctttcctctc ccacatcacg tccagcatac 3300 

agggccttgc caccatccac gcctacaata aagggcagga gtttctgcac agataccagg 3360 

agctgctgga tgacaaccaa gctccttttt ttttgtttac gtgtgcgatg cggtggctgg 3420 

ctgtgcggct ggacctcatc agcatcgccc tcatcaccac cacggggctg atgatcgttc 3480 

ttatgcacgg gcagattccc ccagcctatg cgggtctcgc catctcttat gctgtccagt 3540 

taacggggct gttccagttt acggtcagac tggcatctga gacagaagct cgattcacct 3600

cggtggagag gatcaatcac tacattaaga ctctgtcctt ggaagcacct gccagaatta 3660

agaacaaggc tccctcccct gactggcccc aggagggaga ggtgaccttt gagaacgcag 3720 

agatgaggta ccgagaaaac ctccctcttg tcctaaagaa agtatccttc acgatcaaac 3780 

ctaaagagaa gattggcatt gtggggcgga caggatcagg gaagtcctcg ctggggatgg 3840 

ccctcttccg tctggtggag ttatctggag gctgcatcaa gattgatgga gtgagaatca 3900 

gtgatattgg ccttgccgac ctccgaagca aactctctat cattcctcaa gagccggtgc 3960 

tgttcagtgg cactgtcaga tcaaatttgg accccttcaa ccagtacact gaagaccaga 4020 

tttgggatgc cctggagagg acacacatga aagaatgtat tgctcagcta cctctgaaac 4080 

ttgaatctga agtgatggag aatggggata acttctcagt gggggaacgg cagctcttgt 4140 

gcatagctag agccctgctc cgccactgta agattctgat tttagatgaa gccacagctg 4200

ccatggacac agagacagac ttattgattc aagagaccat ccgagaagca tttgcagact 4260 

gtaccatgct gaccattgcc catcgcctgc acacggttct aggctccgat aggattatgg 4320 

tgctggccca gggacaggtg gtggagtttg acaccccatc ggtccttctg tccaacgaca 4380 

gttcccgatt ctatgccatg tttgctgctg cagagaacaa ggtcgctgtc aagggctgac 4440 

tcctccctgt tgacgaagtc tcttttcttt agagcattgc cattccctgc ctggggcggg 4500 

cccctcatcg cgtcctccta ccgaaacctt gcctttctcg attttatctt tcgcacagca 4560 

gttccggatt ggcttgtgtg tttcactttt agggagagtc atattttgat tattgtattt 4620 

attccatatt catgtaaaca aaatttagtt tttgttctta attgcactct aaaaggttca 4680 

gggaaccgtt attataattg tatcagaggc ctataatgaa gctttatacg tgtagctata 4740 

tctatatata attctgtaca tagcctatat ttacagtgaa aatgtaagct gtttatttta 4800 

tattaaaata agcactgtgc taataacagt gcatattcct ttctatcatt tttgtacagt 4860 

ttgctgtact agagatctgg ttttgctatt agactgtagg aagagtagca tttcattctt 4920 

ctctagctgg tggtttcacg gtgccaggtt ttctgggtgt ccaaaggaag acgtgtggca 4980 

atagtgggcc ctccgacagc cccctctgcc gcctccccac agccgctcca ggggtggctg 5040 

gagacgggtg ggcggctgga gaccatgcag agcgccgtga gttctcaggg ctcctgcctt 5100 

ctgtcctggt gtcacttact gtttctgtca ggagagcagc ggggcgaagc ccaggcccct 5160 

tttcactccc tccatcaaga atggggatca cagagacatt cctccgagcc ggggagtttc 5220 

tttcctgcct tcttcttttt gctgttgttt ctaaacaaga atcagtctat ccacagagag 5280 

tcccactgcc tcaggttcct atggctggcc actgcacaga gctctccagc tccaagacct 5340 

gttggttcca agccctggag ccaactgctg ctttttgagg tggcactttt tcatttgcct 5400 

attcccacac ctccacagtt cagtggcagg gctcaggatt tcgtgggtct gttttccttt 5460 

ctcaccgcag tcgtcgcaca gtctctctct ctctctcccc tcaaagtctg caactttaag 5520 

cagctcttgc taatcagtgt ctcacactgg cgtagaagtt tttgtactgt aaagagacct 5580 

acctcaggtt gctggttgct gtgtggtttg gtgtgttccc gcaaaccccc tttgtgctgt 5640 

ggggctggta gctcaggtgg gcgtggtcac tgctgtcatc agttgaatgg tcagcgttgc 5700 

atgtcgtgac caactagaca ttctgtcgcc ttagcatgtt tgctgaacac cttgtggaag 5760

caaaaatctg aaaatgtgaa taaaattatt ttggattttg taaaaaaaaa aaaaaaaaaa 5820 

aaaaaaaaaa aaaaaaaa 5838 



<210> 4 
<211> 1437 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Met Lys Asp Ile Asp Ile Gly Lys Glu Tyr Ile Ile Pro Ser Pro Gly

1 5 10 15 

Tyr Arg Ser Val Arg Glu Arg Thr Ser Thr Ser Gly Thr His Arg Asp 

20 25 30 

Arg Glu Asp Ser Lys Phe Arg Arg Thr Arg Pro Leu Glu Cys Gln Asp

35 40 45 

Ala Leu Glu Thr Ala Ala Arg Ala Glu Gly Leu Ser Leu Asp Ala Ser

50 55 60

Met His Ser Gln Leu Arg Ile Leu Asp Glu Glu His Pro Lys Gly Lys
65 70 75 80

Tyr His His Gly Leu Ser Ala Leu Lys Pro Ile Arg Thr Thr Ser Lys

85 90 95

His Gln His Pro Val Asp Asn Ala Gly Leu Phe Ser Cys Met Thr Phe

100 105 110

Ser Trp Leu Ser Ser Leu Ala Arg Val Ala His Lys Lys Gly Glu Leu

115 120 125

Ser Met Glu Asp Val Trp Ser Leu Ser Lys His Glu Ser Ser Asp Val

130 135 140

Asn Cys Arg Arg Leu Glu Arg Leu Trp Gln Glu Glu Leu Asn Glu Val
145 150 155 160

Gly Pro Asp Ala Ala Ser Leu Arg Arg Val Val Trp Ile Phe Cys Arg

165 170 175

Thr Arg Leu Ile Leu Ser Ile Val Cys Leu Met Ile Thr Gln Leu Ala

180 185 190

Gly Phe Ser Gly Pro Ala Phe Met Val Lys His Leu Leu Glu Tyr Thr

195 200 205

Gln Ala Thr Glu Ser Asn Leu Gln Tyr Ser Leu Leu Leu Val Leu Gly

210 215 220

Leu Leu Leu Thr Glu Ile Val Arg Ser Trp Ser Leu Ala Leu Thr Trp
225 230 235 240

Ala Leu Asn Tyr Arg Thr Gly Val Arg Leu Arg Gly Ala Ile Leu Thr

245 250 255

Met Ala Phe Lys Lys Ile Leu Lys Leu Lys Asn Ile Lys Glu Lys Ser

260 265 270

Leu Gly Glu Leu Ile Asn Ile Cys Ser Asn Asp Gly Gln Arg Met Phe

275 280 285

Glu Ala Ala Ala Val Gly Ser Leu Leu Ala Gly Gly Pro Val Val Ala

290 295 300

Ile Leu Gly Met Ile Tyr Asn Val Ile Ile Leu Gly Pro Thr Gly Phe
305 310 315 320

Leu Gly Ser Ala Val Phe Ile Leu Phe Tyr Pro Ala Met Met Phe Ala

325 330 335

Ser Arg Leu Thr Ala Tyr Phe Arg Arg Lys Cys Val Ala Ala Thr Asp

340 345 350

Glu Arg Val Gln Lys Met Asn Glu Val Leu Thr Tyr Ile Lys Phe Ile

355 360 365

Lys Met Tyr Ala Trp Val Lys Ala Phe Ser Gln Ser Val Gln Lys Ile

370 375 380

Arg Glu Glu Glu Arg Arg Ile Leu Glu Lys Ala Gly Tyr Phe Gln Gly
385 390 395 400

Ile Thr Val Gly Val Ala Pro Ile Val Val Val Ile Ala Ser Val Val

405 410 415

Thr Phe Ser Val His Met Thr Leu Gly Phe Asp Leu Thr Ala Ala Gln

420 425 430

Ala Phe Thr Val Val Thr Val Phe Asn Ser Met Thr Phe Ala Leu Lys

435 440 445

Val Thr Pro Phe Ser Val Lys Ser Leu Ser Glu Ala Ser Val Ala Val

450 455 460

Asp Arg Phe Lys Ser Leu Phe Leu Met Glu Glu Val His Met Ile Lys
465 470 475 480

Asn Lys Pro Ala Ser Pro His Ile Lys Ile Glu Met Lys Asn Ala Thr

485 490 495

Leu Ala Trp Asp Ser Ser His Ser Ser Ile Gln Asn Ser Pro Lys Leu

500 505 510

Thr Pro Lys Met Lys Lys Asp Lys Arg Ala Ser Arg Gly Lys Lys Glu

515 520 525

Lys Val Arg Gln Leu Gln Arg Thr Glu His Gln Ala Val Leu Ala Glu

530 535 540

Gln Lys Gly His Leu Leu Leu Asp Ser Asp Glu Arg Pro Ser Pro Glu
545 550 555 560

Glu Glu Glu Gly Lys His Ile His Leu Gly His Leu Arg Leu Gln Arg

565 570 575

Thr Leu His Ser Ile Asp Leu Glu Ile Gln Glu Gly Lys Leu Val Gly

580 585 590

Ile Cys Gly Ser Val Gly Ser Gly Lys Thr Ser Leu Ile Ser Ala Ile

595 600 605

Leu Gly Gln Met Thr Leu Leu Glu Gly Ser Ile Ala Ile Ser Gly Thr

610 615 620

Phe Ala Tyr Val Ala Gln Gln Ala Trp Ile Leu Asn Ala Thr Leu Arg
625 630 635 640

Asp Asn Ile Leu Phe Gly Lys Glu Tyr Asp Glu Glu Arg Tyr Asn Ser

645 650 655

Val Leu Asn Ser Cys Cys Leu Arg Pro Asp Leu Ala Ile Leu Pro Ser

660 665 670

Ser Asp Leu Thr Glu Ile Gly Glu Arg Gly Ala Asn Leu Ser Gly Gly

675 680 685

Gln Arg Gln Arg Ile Ser Leu Ala Arg Ala Leu Tyr Ser Asp Arg Ser

690 695 700

Ile Tyr Ile Leu Asp Asp Pro Leu Ser Ala Leu Asp Ala His Val Gly
705 710 715 720

Asn His Ile Phe Asn Ser Ala Ile Arg Lys His Leu Lys Ser Lys Thr

725 730 735

Val Leu Phe Val Thr His Gln Leu Gln Tyr Leu Val Asp Cys Asp Glu

740 745 750

Val Ile Phe Met Lys Glu Gly Cys Ile Thr Glu Arg Gly Thr His Glu

755 760 765

Glu Leu Met Asn Leu Asn Gly Asp Tyr Ala Thr Ile Phe Asn Asn Leu

770 775 780

Leu Leu Gly Glu Thr Pro Pro Val Glu Ile Asn Ser Lys Lys Glu Thr
785 790 795 800

Ser Gly Ser Gln Lys Lys Ser Gln Asp Lys Gly Pro Lys Thr Gly Ser

805 810 815

Val Lys Lys Glu Lys Ala Val Lys Pro Glu Glu Gly Gln Leu Val Gln

820 825 830

Leu Glu Glu Lys Gly Gln Gly Ser Val Pro Trp Ser Val Tyr Gly Val

835 840 845

Tyr Ile Gln Ala Ala Gly Gly Pro Leu Ala Phe Leu Val Ile Met Ala

850 855 860 

Leu Phe Met Leu Asn Val Gly Ser Thr Ala Phe Ser Thr Trp Trp Leu 
865 870 875 880 

Ser Tyr Trp Ile Lys Gln Gly Ser Gly Asn Thr Thr Val Thr Arg Gly

885 890 895

Asn Glu Thr Ser Val Ser Asp Ser Met Lys Asp Asn Pro His Met Gln

900 905 910

Tyr Tyr Ala Ser Ile Tyr Ala Leu Ser Met Ala Val Met Leu Ile Leu

915 920 925

Lys Ala Ile Arg Gly Val Val Phe Val Lys Gly Thr Leu Arg Ala Ser

930 935 940

Ser Arg Leu His Asp Glu Leu Phe Arg Arg Ile Leu Arg Ser Pro Met
945 950 955 960

Lys Phe Phe Asp Thr Thr Pro Thr Gly Arg Ile Leu Asn Arg Phe Ser

965 970 975

Lys Asp Met Asp Glu Val Asp Val Arg Leu Pro Phe Gln Ala Glu Met

980 985 990

Phe Ile Gln Asn Val Ile Leu Val Phe Phe Cys Val Gly Met Ile Ala

995 1000 1005

Gly Val Phe Pro Trp Phe Leu Val Ala Val Gly Pro Leu Val Ile Leu

1010 1015 1020

Phe Ser Val Leu His Ile Val Ser Arg Val Leu Ile Arg Glu Leu Lys
1025 1030 1035 1040

Arg Leu Asp Asn Ile Thr Gln Ser Pro Phe Leu Ser His Ile Thr Ser

1045 1050 1055

Ser Ile Gln Gly Leu Ala Thr Ile His Ala Tyr Asn Lys Gly Gln Glu

1060 1065 1070

Phe Leu His Arg Tyr Gln Glu Leu Leu Asp Asp Asn Gln Ala Pro Phe

1075 1080 1085 

Phe Leu Phe Thr Cys Ala Met Arg Trp Leu Ala Val Arg Leu Asp Leu 

1090 1095 1100 

Ile Ser Ile Ala Leu Ile Thr Thr Thr Gly Leu Met Ile Val Leu Met
1105 1110 1115 1120

His Gly Gln Ile Pro Pro Ala Tyr Ala Gly Leu Ala Ile Ser Tyr Ala

1125 1130 1135

Val Gln Leu Thr Gly Leu Phe Gln Phe Thr Val Arg Leu Ala Ser Glu

1140 1145 1150

Thr Glu Ala Arg Phe Thr Ser Val Glu Arg Ile Asn His Tyr Ile Lys

1155 1160 1165

Thr Leu Ser Leu Glu Ala Pro Ala Arg Ile Lys Asn Lys Ala Pro Ser

1170 1175 1180

Pro Asp Trp Pro Gln Glu Gly Glu Val Thr Phe Glu Asn Ala Glu Met
1185 1190 1195 1200

Arg Tyr Arg Glu Asn Leu Pro Leu Val Leu Lys Lys Val Ser Phe Thr

1205 1210 1215

Ile Lys Pro Lys Glu Lys Ile Gly Ile Val Gly Arg Thr Gly Ser Gly

1220 1225 1230

Lys Ser Ser Leu Gly Met Ala Leu Phe Arg Leu Val Glu Leu Ser Gly

1235 1240 1245

Gly Cys Ile Lys Ile Asp Gly Val Arg Ile Ser Asp Ile Gly Leu Ala

1250 1255 1260

Asp Leu Arg Ser Lys Leu Ser Ile Ile Pro Gln Glu Pro Val Leu Phe
1265 1270 1275 1280

Ser Gly Thr Val Arg Ser Asn Leu Asp Pro Phe Asn Gln Tyr Thr Glu

1285 1290 1295

Asp Gln Ile Trp Asp Ala Leu Glu Arg Thr His Met Lys Glu Cys Ile

1300 1305 1310

Ala Gln Leu Pro Leu Lys Leu Glu Ser Glu Val Met Glu Asn Gly Asp

1315 1320 1325

Asn Phe Ser Val Gly Glu Arg Gln Leu Leu Cys Ile Ala Arg Ala Leu

1330 1335 1340

Leu Arg His Cys Lys Ile Leu Ile Leu Asp Glu Ala Thr Ala Ala Met
1345 1350 1355 1360

Asp Thr Glu Thr Asp Leu Leu Ile Gln Glu Thr Ile Arg Glu Ala Phe

1365 1370 1375

Ala Asp Cys Thr Met Leu Thr Ile Ala His Arg Leu His Thr Val Leu

1380 1385 1390

Gly Ser Asp Arg Ile Met Val Leu Ala Gln Gly Gln Val Val Glu Phe

1395 1400 1405 

Asp Thr Pro Ser Val Leu Leu Ser Asn Asp Ser Ser Arg Phe Tyr Ala 

1410 1415 1420 

Met Phe Ala Ala Ala Glu Asn Lys Val Ala Val Lys Gly 
1425 1430 1435 



<210> 5 
<211> 5079 
<212> DNA 

<213> Homo sapiens 



<400> 5 

ccccatggac gccctgtgcg gttccgggga gctcggctcc aagttctggg actccaacct 60 

gtctgtgcac acagaaaacc cggacctcac tccctgcttc cagaactccc tgctggcctg 120 

ggtgccctgc atctacctgt gggtcgccct gccctgctac ttgctctacc tgcggcacca 180 

ttgtcgtggc tacatcatcc tctcccacct gtccaagctc aagatggtcc tgggtgtcct 240 

gctgtggtgc gtctcctggg cggacctttt ttactccttc catggcctgg tccatggccg 300 

ggcccctgcc cctgttttct ttgtcacccc cttggtggtg ggggtcacca tgctgctggc 360 

caccctgctg atacagtatg agcggctgca gggcgtacag tcttcggggg tcctcattat 420 

cttctggttc ctgtgtgtgg tctgcgccat cgtcccattc cgctccaaga tccttttagc 480 

caaggcagag ggtgagatct cagacccctt ccgcttcacc accttctaca tccactttgc 540 

cctggtactc tctgccctca tcttggcctg cttcagggag aaacctccat ttttctccgc 600 

aaagaatgtc gaccctaacc cctaccctga gaccagcgct ggctttctct cccgcctgtt 660 

tttctggtgg ttcacaaaga tggccatcta tggctaccgg catcccctgg aggagaagga 720

cctctggtcc ctaaaggaag aggacagatc ccagatggtg gtgcagcagc tgctggaggc 780 

atggaggaag caggaaaagc agacggcacg acacaaggct tcagcagcac ctgggaaaaa 840 

tgcctccggc gaggacgagg tgctgctggg tgcccggccc aggccccgga agccctcctt 900

cctgaaggcc ctgctggcca ccttcggctc cagcttcctc atcagtgcct gcttcaagct 960

tatccaggac ctgctctcct tcatcaatcc acagctgctc agcatcctga tcaggtttat 1020 

ctccaacccc atggccccct cctggtgggg cttcctggtg gctgggctga tgttcctgtg 1080 
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ctccatgatg cagtcgctga tcttacaaca ctattaccac tacatctttg tgacnggggt 1140 
gaagtttcgt actgggatca tgggtgtcat ctacaggaag gctctggtta tcaccaactc 1200 
agtcaaacgt gcgtccactg tgggggaaat tgtcaacctc atgtcagtgg atgcccagcg 1260 
cttcatggac cttgccccct tcctcaatct gctgtggtca gcacccctgc agaccatcct 1320 
ggcgatctac ttcctctggc agaacctagg tccctctgtc ctggctggag tcgccttcat 1380 
ggtcttgctg attccactca acggagctgt ggccgtgaag atgcgcgcct tccaggtaaa 1440 
gcaaatgaaa ttgaaggact cgcgcatcaa gctgatgagt gagatcctga acgacatcaa 1500 
ggtgctgaag ctgtacgcct gggagcccag cttcctgaag caggtggagg gcatcaggca 1560 
gggtgagctc cagctgctgc gcacggcggc ctacctccac accacaacca ccttcacctg 1620 

gatgtgcagc cccttcctgg tgaccctgat caccctctgg gtgtacgtgt acgtggaccc 1680 

aaacaatgtg ctggacgccg agaaggcctt tgtgtctgtg tccttgttta atatcttaag 1740 

acttcccctc aacatgctgc cccagttaat cagcaacctg actcaggcca gtgtgtctct 1800 

gaaacggatc cagcaattcc tgagccaaga ggaacttgac ccccagagtg tggaaagaaa 1860 

gaccatctcc ccaggctatg ccatcaccat acacagtggc accttcacct gggcccagga 1920 

cctgcccccc actctgcaca gcctagacat ccaggtcccg aaaggggcac tggtggccgt 1980 

ggtggggcct gtgggctgtg ggaagtcctc cctggtgtct gccctgctgg gagagatgga 2040 

gaagctagaa ggcaaagtgc acatgaaggg ctccgtggcc tatgtgcccc agcaggcatg 2100 

gatccagaac tgcactcttc aggaaaacgt gcttttcggc aaagccctga accccaagcg 2160 

ctaccagcag actctggagg cctgtgcctt gctagctgac ctggagatgc tgcctggtgg 2220 

ggatcagaca gagattggag agaagggcat taacctgtct gggggccagc ggcagcgggt 2280 

cagtctggct cgagctgttt acagtgatgc cgatattttc ttgctggatg acccactgtc 2340 

cgcggtggac tctcatgtgg ccaagcacat ctttgaccac gtcatcgggc cagaaggcgt 2400 

gctggcaggc aagacgcgag tgctggtgac gcacggcatt agcttcctgc cccagacaga 2460

cttcatcatt gtgctagctg atggacaggt gtctgagatg ggcccgtacc cagccctgct 2520 

gcagcgcaac ggctcctttg ccaactttct ctgcaactat gcccccgatg aggaccaagg 2580 

gcacctggag gacagctgga ccgcgttgga aggtgcagag gataaggagg cactgctgat 2640 

tgaagacaca ctcagcaacc acacggatct gacagacaat gatccagtca cctatgtggt 2700 

ccagaagcag tttatgagac agctgagtgc cctgtcctca gatggggagg gacagggtcg 2760 

gcctgtaccc cggaggcacc tgggtccatc agagaaggtg caggtgacag aggcgaaggc 2820

agatggggca ctgacccagg aggagaaagc agccattggc actgtggagc tcagtgtgtt 2880 

ctgggattat gccaaggccg tggggctctg taccacgctg gccatctgtc tcctgtatgt 2940 

gggtcaaagt gcggctgcca ttggagccaa tgtgtggctc agtgcctgga caaatgatgc 3000

catggcagac agtagacaga acaacacttc cctgaggctg ggcgtctatg ctgctttagg 3060 

aattctgcaa gggttcttgg tgatgctggc agccatggcc atggcagcgg gtggcatcca 3120 

ggctgcccgt gtgttgcacc aggcactgct gcacaacaag atacgctcgc cacagtcctt 3180 

ctttgacacc acaccatcag gccgcatcct gaactgcttc tccaaggaca tctatgtcgt 3240 

tgatgaggtt ctggcccctg tcatcctcat gctgctcaat tccttcttca acgccatctc 3300 

cactcttgtg gtcatcatgg ccagcacgcc gctcttcact gtggtcatcc tgcccctggc 3360 

tgtgctctac accttagtgc agcgcttcta tgcagccaca tcacggcaac tgaagcggct 3420 

ggaatcagtc agccgctcac ctatctactc ccacttttcg gagacagtga ctggtgccag 3480 

tgtcatccgg gcctacaacc gcagccggga ttttgagatc atcagtgata ctaaggtgga 3540 

tgccaaccag agaagctgct acccctacat catctccaac cggtggctga gcatcggagt 3600 

ggagttcgtg gggaactgcg tggtgctctt tgctgcacta tttgccgtca tcgggaggag 3660

cagcctgaac ccggggctgg tgggcctttc tgtgtcctac tccttgcagg tgacatttgc 3720 

tctgaactgg atgatacgaa tgatgtcaga tttggaatct aacatcgtgg ctgtggagag 3780 

ggtcaaggag tactccaaga cagagacaga ggcgccctgg gtggtggaag gcagccgccc 3840 

tcccgaaggt tggcccccac gtggggaggt ggagttccgg aattattctg tgcgctaccg 3900 

gccgggccta gacctggtgc tgagagacct gagtctgcat gtgcacggtg gcgagaaggt 3960 

ggggatcgtg ggccgcactg gggctggcaa gtcttccatg accctttgcc tgttccgcat 4020 

cctggaggcg gcaaagggtg aaatccgcat tgatggcctc aatgtggcag acatcggcct 4080 

ccatgacctg cgctctcagc tgaccatcat cccgcaggac cccatcctgt tctcggggac 4140 

cctgcgcatg aacctggacc ccttcggcag ctactcagag gaggacattt ggtgggcttt 4200 

ggagctgtcc cacctgcaca cgtttgtgag ctcccagccg gcaggcctgg acttccagtg 4260 

ctcagagggc ggggagaatc tcagcgtggg ccagaggcag ctcgtgtgcc tggcccgagc 4320 

cctgctccgc aagagccgca tcctggtttt agacgaggcc acagctgcca tcgacctgga 4380 

gactgacaac ctcatccagg ctaccatccg cacccagttt gatacctgca ctgtcctgac 4440 

catcgcacac cggcttaaca ctatcatgga ctacaccagg gtcctggtcc tggacaaagg 4500 

agtagtagct gaatttgatt ctccagccaa cctcattgca gctagaggca tcttctacgg 4560 

gatggccaga gatgctggac ttgcctaaaa tatattcctg agatttcctc ctggcctttc 4620 

ctggttttca tcaggaagga aatgacacca aatatgtccg cagaatggac ttgatagcaa 4680

acactggggg caccttaaga ttttgcacct gtaaagtgcc ttacagggta actgtgctga 4740 

atgctttaga tgaggaaatg atccccaagt ggtgaatgac acgcctaagg tcacagctag 4800 

tttgagccag ttagactagt ccccggtctc ccgattccca actgagtgtt atttgcacac 4860 

tgcactgttt tcaaataacg attttatgaa atgacctctg tcctccctct gatttttcat 4920

attttctaaa gtttcgtttc tgttttttaa taaaaagctt tttcctcctg gaacagaaga 4980 

cagctgctgg gtcaggccac ccctaggaac tcagtcctgt actctggggt gctgcctgaa 5040

tccattaaaa atgggagtac tgatgaaata aaactacag 5079



<210> 6 

<211> 1527 

<212> PRT 

<213> Homo sapiens 

<400> 6 

Met Asp Ala Leu Cys Gly Ser Gly Glu Leu Gly Ser Lys Phe Trp Asp 

1 5 10 15

Ser Asn Leu Ser Val His Thr Glu Asn Pro Asp Leu Thr Pro Cys Phe 

20 25 30 

Gln Asn Ser Leu Leu Ala Trp Val Pro Cys Ile Tyr Leu Trp Val Ala

35 40 45

Leu Pro Cys Tyr Leu Leu Tyr Leu Arg His His Cys Arg Gly Tyr Ile

50 55 60

Ile Leu Ser His Leu Ser Lys Leu Lys Met Val Leu Gly Val Leu Leu
65 70 75 80

Trp Cys Val Ser Trp Ala Asp Leu Phe Tyr Ser Phe His Gly Leu Val

85 90 95

His Gly Arg Ala Pro Ala Pro Val Phe Phe Val Thr Pro Leu Val Val

100 105 110

Gly Val Thr Met Leu Leu Ala Thr Leu Leu Ile Gln Tyr Glu Arg Leu

115 120 125

Gln Gly Val Gln Ser Ser Gly Val Leu Ile Ile Phe Trp Phe Leu Cys

130 135 140

Val Val Cys Ala Ile Val Pro Phe Arg Ser Lys Ile Leu Leu Ala Lys
145 150 155 160

Ala Glu Gly Glu Ile Ser Asp Pro Phe Arg Phe Thr Thr Phe Tyr Ile

165 170 175

His Phe Ala Leu Val Leu Ser Ala Leu Ile Leu Ala Cys Phe Arg Glu

180 185 190 

Lys Pro Pro Phe Phe Ser Ala Lys Asn Val Asp Pro Asn Pro Tyr Pro 

195 200 205 

Glu Thr Ser Val Gly Phe Leu Ser Arg Leu Phe Phe Trp Trp Phe Thr 

210 215 220 

Lys Met Ala Ile Tyr Gly Tyr Arg His Pro Leu Glu Glu Lys Asp Leu
225 230 235 240

Trp Ser Leu Lys Glu Glu Asp Arg Ser Gln Met Val Val Gln Gln Leu

245 250 255

Leu Glu Ala Trp Arg Lys Gln Glu Lys Gln Thr Ala Arg His Lys Ala

260 265 270

Ser Ala Ala Pro Gly Lys Asn Ala Ser Gly Glu Asp Glu Val Leu Leu

275 280 285

Gly Ala Arg Pro Arg Pro Arg Lys Pro Ser Phe Leu Lys Ala Leu Leu

290 295 300

Ala Thr Phe Gly Ser Ser Phe Leu Ile Ser Ala Cys Phe Lys Leu Ile
305 310 315 320

Gln Asp Leu Leu Ser Phe Ile Asn Pro Gln Leu Leu Ser Ile Leu Ile

325 330 335

Arg Phe Ile Ser Asn Pro Met Ala Pro Ser Trp Trp Gly Phe Leu Val

340 345 350

Ala Gly Leu Met Phe Leu Cys Ser Met Met Gln Ser Leu Ile Leu Gln

355 360 365

His Tyr Tyr His Tyr Ile Phe Val Thr Gly Val Lys Phe Arg Thr Gly

370 375 380

Ile Met Gly Val Ile Tyr Arg Lys Ala Leu Val Ile Thr Asn Ser Val
385 390 395 400

Lys Arg Ala Ser Thr Val Gly Glu Ile Val Asn Leu Met Ser Val Asp

405 410 415

Ala Gln Arg Phe Met Asp Leu Ala Pro Phe Leu Asn Leu Leu Trp Ser

420 425 430

Ala Pro Leu Gln Ile Ile Leu Ala Ile Tyr Phe Leu Trp Gln Asn Leu
435 440 445 



WO 99/49735 PCT/US99/06644

Gly Pro Ser Val Leu Ala Gly Val Ala Phe Met Val Leu Leu Ile Pro

450 455 460

Leu Asn Gly Ala Val Ala Val Lys Met Arg Ala Phe Gln Val Lys Gln
465 470 475 480

Met Lys Leu Lys Asp Ser Arg Ile Lys Leu Met Ser Glu Ile Leu Asn

485 490 495

Gly Ile Lys Val Leu Lys Leu Tyr Ala Trp Glu Pro Ser Phe Leu Lys

500 505 510

Gln Val Glu Gly Ile Arg Gln Gly Glu Leu Gln Leu Leu Arg Thr Ala

515 520 525

Ala Tyr Leu His Thr Thr Thr Thr Phe Thr Trp Met Cys Ser Pro Phe

530 535 540

Leu Val Thr Leu Ile Thr Leu Trp Val Tyr Val Tyr Val Asp Pro Asn
545 550 555 560

Asn Val Leu Asp Ala Glu Lys Ala Phe Val Ser Val Ser Leu Phe Asn

565 570 575

Ile Leu Arg Leu Pro Leu Asn Met Leu Pro Gln Leu Ile Ser Asn Leu

580 585 590

Thr Gln Ala Ser Val Ser Leu Lys Arg Ile Gln Gln Phe Leu Ser Gln

595 600 605

Glu Glu Leu Asp Pro Gln Ser Val Glu Arg Lys Thr Ile Ser Pro Gly

610 615 620

Tyr Ala Ile Thr Ile His Ser Gly Thr Phe Thr Trp Ala Gln Asp Leu
625 630 635 640

Pro Pro Thr Leu His Ser Leu Asp Ile Gln Val Pro Lys Gly Ala Leu

645 650 655 

Val Ala Val Val Gly Pro Val Gly Cys Gly Lys Ser Ser Leu Val Ser 

660 665 670 

Ala Leu Leu Gly Glu Met Glu Lys Leu Glu Gly Lys Val His Met Lys 

675 680 685 

Gly Ser Val Ala Tyr Val Pro Gln Gln Ala Trp Ile Gln Asn Cys Thr

690 695 700

Leu Gln Glu Asn Val Leu Phe Gly Lys Ala Leu Asn Pro Lys Arg Tyr
705 710 715 720

Gln Gln Thr Leu Glu Ala Cys Ala Leu Leu Ala Asp Leu Glu Met Leu

725 730 735

Pro Gly Gly Asp Gln Thr Glu Ile Gly Glu Lys Gly Ile Asn Leu Ser

740 745 750

Gly Gly Gln Arg Gln Arg Val Ser Leu Ala Arg Ala Val Tyr Ser Asp

755 760 765

Ala Asp Ile Phe Leu Leu Asp Asp Pro Leu Ser Ala Val Asp Ser His

770 775 780

Val Ala Lys His Ile Phe Asp His Val Ile Gly Pro Glu Gly Val Leu
785 790 795 800

Ala Gly Lys Thr Arg Val Leu Val Thr His Gly Ile Ser Phe Leu Pro

805 810 815

Gln Thr Asp Phe Ile Ile Val Leu Ala Asp Gly Gln Val Ser Glu Met

820 825 830

Gly Pro Tyr Pro Ala Leu Leu Gln Arg Asn Gly Ser Phe Ala Asn Phe

835 840 845

Leu Cys Asn Tyr Ala Pro Asp Glu Asp Gln Gly His Leu Glu Asp Ser

850 855 860

Trp Thr Ala Leu Glu Gly Ala Glu Asp Lys Glu Ala Leu Leu Ile Glu
865 870 875 880

Asp Thr Leu Ser Asn His Thr Asp Leu Thr Asp Asn Asp Pro Val Thr

885 890 895

Tyr Val Val Gln Lys Gln Phe Met Arg Gln Leu Ser Ala Leu Ser Ser

900 905 910

Asp Gly Glu Gly Gln Gly Arg Pro Val Pro Arg Arg His Leu Gly Pro

915 920 925

Ser Glu Lys Val Gln Val Thr Glu Ala Lys Ala Asp Gly Ala Leu Thr

930 935 940

Gln Glu Glu Lys Ala Ala Ile Gly Thr Val Glu Leu Ser Val Phe Trp
945 950 955 960

Asp Tyr Ala Lys Ala Val Gly Leu Cys Thr Thr Leu Ala Ile Cys Leu
965 970 975

Leu Tyr Val Gly Gln Ser Ala Ala Ala Ile Gly Ala Asn Val Trp Leu

980 985 990

Ser Ala Trp Thr Asn Asp Ala Met Ala Asp Ser Arg Gln Asn Asn Thr

995 1000 1005

Ser Leu Arg Leu Gly Val Tyr Ala Ala Leu Gly Ile Leu Gln Gly Phe

1010 1015 1020

Leu Val Met Leu Ala Ala Met Ala Met Ala Ala Gly Gly Ile Gln Ala
1025 1030 1035 1040

Ala Arg Val Leu His Gln Ala Leu Leu His Asn Lys Ile Arg Ser Pro

1045 1050 1055

Gln Ser Phe Phe Asp Thr Thr Pro Ser Gly Arg Ile Leu Asn Cys Phe

1060 1065 1070

Ser Lys Asp Ile Tyr Val Val Asp Glu Val Leu Ala Pro Val Ile Leu

1075 1080 1085

Met Leu Leu Asn Ser Phe Phe Asn Ala Ile Ser Thr Leu Val Val Ile

1090 1095 1100

Met Ala Ser Thr Pro Leu Phe Thr Val Val Ile Leu Pro Leu Ala Val
1105 1110 1115 1120

Leu Tyr Thr Leu Val Gln Arg Phe Tyr Ala Ala Thr Ser Arg Gln Leu

1125 1130 1135

Lys Arg Leu Glu Ser Val Ser Arg Ser Pro Ile Tyr Ser His Phe Ser

1140 1145 1150

Glu Thr Val Thr Gly Ala Ser Val Ile Arg Ala Tyr Asn Arg Ser Arg

1155 1160 1165

Asp Phe Glu Ile Ile Ser Asp Thr Lys Val Asp Ala Asn Gln Arg Ser

1170 1175 1180

Cys Tyr Pro Tyr Ile Ile Ser Asn Arg Trp Leu Ser Ile Gly Val Glu
1185 1190 1195 1200

Phe Val Gly Asn Cys Val Val Leu Phe Ala Ala Leu Phe Ala Val Ile

1205 1210 1215 

Gly Arg Ser Ser Leu Asn Pro Gly Leu Val Gly Leu Ser Val Ser Tyr 

1220 1225 1230 

Ser Leu Gln Val Thr Phe Ala Leu Asn Trp Met Ile Arg Met Met Ser

1235 1240 1245

Asp Leu Glu Ser Asn Ile Val Ala Val Glu Arg Val Lys Glu Tyr Ser

1250 1255 1260

Lys Thr Glu Thr Glu Ala Pro Trp Val Val Glu Gly Ser Arg Pro Pro
1265 1270 1275 1280

Glu Gly Trp Pro Pro Arg Gly Glu Val Glu Phe Arg Asn Tyr Ser Val

1285 1290 1295

Arg Tyr Arg Pro Gly Leu Asp Leu Val Leu Arg Asp Leu Ser Leu His

1300 1305 1310

Val His Gly Gly Glu Lys Val Gly Ile Val Gly Arg Thr Gly Ala Gly

1315 1320 1325

Lys Ser Ser Met Thr Leu Cys Leu Phe Arg Ile Leu Glu Ala Ala Lys

1330 1335 1340

Gly Glu Ile Arg Ile Asp Gly Leu Asn Val Ala Asp Ile Gly Leu His
1345 1350 1355 1360

Asp Leu Arg Ser Gln Leu Thr Ile Ile Pro Gln Asp Pro Ile Leu Phe

1365 1370 1375

Ser Gly Thr Leu Arg Met Asn Leu Asp Pro Phe Gly Ser Tyr Ser Glu

1380 1385 1390

Glu Asp Ile Trp Trp Ala Leu Glu Leu Ser His Leu His Thr Phe Val

1395 1400 1405

Ser Ser Gln Pro Ala Gly Leu Asp Phe Gln Cys Ser Glu Gly Gly Glu

1410 1415 1420

Asn Leu Ser Val Gly Gln Arg Gln Leu Val Cys Leu Ala Arg Ala Leu
1425 1430 1435 1440

Leu Arg Lys Ser Arg Ile Leu Val Leu Asp Glu Ala Thr Ala Ala Ile

1445 1450 1455

Asp Leu Glu Thr Asp Asn Leu Ile Gln Ala Thr Ile Arg Thr Gln Phe

1460 1465 1470

Asp Thr Cys Thr Val Leu Thr Ile Ala His Arg Leu Asn Thr Ile Met

1475 1480 1485

Asp Tyr Thr Arg Val Leu Val Leu Asp Lys Gly Val Val Ala Glu Phe

1490 1495 1500

Asp Ser Pro Ala Asn Leu Ile Ala Ala Arg Gly Ile Phe Tyr Gly Met
1505 1510 1515 1520 

Ala Arg Asp Ala Gly Leu Ala 
1525 

<210> 7 
<211> 4509 
<212> DNA 

<213> Homo sapiens 
<400> 7 

atggccgcgc ctgctgagcc ctgcgcgggg cagggggtct ggaaccagac agagcctgaa 60 

cctgccgcca ccagcctgct gagcctgtgc ttcctgagaa cagcaggggt ctgggtaccc 120 

cccatgtacc tctgggtcct tggtcccatc tacctcctct tcatccacca ccatggccgg 180 

ggctacctcc ggatgtcccc actcttcaaa gccaagatgg tgcttggatt cgccctcata 240 

gtcctgtgta cctccagcgt ggctgtcgct ctttggaaaa tccaacaggg aacgcctgag 300

gccccagaat tcctcattca tcctactgtg tggctcacca cgatgagctt cgcagtgttc 360 

ctgattcaca ccgagaggaa aaagggagtc cagtcatctg gagtgctgtt tggttactgg 420 

cttctctgct ttgtcttgcc agctaccaac gctgcccagc aggcctccgg agcgggcttc 480 

cagagcgacc ctgtccgcca cctgtccacc tacctatgcc tgtctctggt ggtggcacag 540 

tttgtgctgt cctgcctggc ggatcaaccc cccttcttcc ctgaagaccc ccagcagtct 600 

aacccctgtc cagagactgg ggcagccttc ccctccaaag ccacgttctg gtgggtttct 660 

ggcctggtct ggaggggata caggaggcca ctgagaccaa aagacctctg gtcgcttggg 720 

agagaaaact cctcagaaga acttgtttcc cggcttgaaa aggagtggat gaggaaccgc 780 

agtgcagccc ggaggcacaa caaggcaata gcatttaaaa ggaaaggcgg cagtggcatg 840 

aaggctccag agaccgagcc cttcctacgg caagaaggga gccagtggcg cccactgctg 900 

aaggccatct ggcaggtgtt ccattctacc ttcctcctgg ggaccctcag cctcatcatc 960

agtgatgtct tcaggttcac tgtccccaag ctgctcagcc ttttcctgga gtttattggt 1020 

gatcccaagc ctccagcctg gaagggctac ctcctcgccg tgctgatgtt cctctcagcc 1080 

tgcctgcaaa cgctgtttga gcagcagaac atgtacaggc tcaaggtgcc gcagatgagg 1140 

ttgcggtcgg ccatcactgg cctggtgtac agaaaggtcc tggctctgtc cagcggctcc 1200 

agaaaggcca gtgcggtggg tgatgtggtc aatctggtgt ccgtggacgt gcagcggctg 1260

accgagagcg tcctctacct caacgggctg tggctgcctc tcgtctggat cgtggtctgc 1320 

ttcgtctatc tctggcagct cctggggccc tccgccctca ctgccatcgc tgtcttcctg 1380 

agcctcctcc ctctgaattt cttcatctcc aagaaaagga accaccatca ggaggagcaa 1440 

atgaggcaga aggactcacg ggcacggctc accagctcta tcctcaggaa ctcgaagacc 1500 

atcaagttcc atggctggga gggagccttt ctggacagag tcctgggcat ccgaggccag 1560

gagctgggcg ccttgcggac ctccggcctc ctcttctctg tgtcgctggt gtccttccaa 1620 

gtgtctacat ttctggtcgc actggtggtg tttgctgtcc acactctggt ggccgagaat 1680 

gctatgaatg cagagaaagc ctttgtgact ctcacagttc tcaacatcct caacaaggcc 1740 

caggctttcc tgcccttctc catccactcc ctcgtccagg cccgggtgtc ctttgaccgt 1800 

ctggtcacct tcctctgcct ggaagaagtt gaccctggtg tcgtagactc aagttcctct 1860 

ggaagcgctg ccgggaagga ttgcatcacc atacacagtg ccaccttcgc ctggtcccag 1920

gaaagccctc cctgcctcca cagaataaac ctcacggtgc cccagggctg tctgctggct 1980 

gttgtcggtc cagtgggggc agggaagtcc tccctgctgt ccgccctcct tggggagctg 2040 

tcaaaggtgg aggggttcgt gagcatcgag ggtgctgtgg cctacgtgcc ccaggaggcc 2100 

tgggtgcaga acacctctgt ggtagagaat gtgtgcttcg ggcaggagct ggacccaccc 2160 

tggctggaga gagtactaga agcctgtgcc ctgcagccag atgtggacag cttccctgag 2220 

ggaatccaca cttcaattgg ggagcagggc atgaatctct ccggaggcca gaagcagcgg 2280 

ctgagcctgg cccgggctgt atacagaaag gcagctgtgt acctgctgga tgaccccctg 2340 

gcggccctgg atgcccacgt tggccagcat gtcttcaacc aggtcattgg gcctggtggg 2400 

ctactccagg gaacaacacg gattctcgtg acgcacgcac tccacatcct gccccaggct 2460

gattggatca tagtgctggc aaatggggcc atcgcagaga tgggttccta ccaggagctt 2520 

ctgcagagga agggggccct cgtgtgtctt ctggatcaag ccagacagcc aggagataga 2580 

ggagaaggag aaacagaacc tgggaccagc accaaggacc ccagaggcac ctctgcaggc 2640

aggaggcccg agcttagacg cgagaggtcc atcaagtcag tccctgagaa ggaccgtacc 2700 

acttcagaag cccagacaga ggttcctctg gatgaccctg acagggcagg atggccagca 2760

ggaaaggaca gcatccaata cggcagggtg aaggccacag tgcacctggc ctacctgcgt 2820 

gccgtgggca cccccctctg cctctacgca ctcttcctct tcctctgcca gcaagtggcc 2880 

tccttctgcc ggggctactg gctgagcctg tgggcggacg accctgcagt aggtgggcag 2940 

cagacgcagg cagccctgcg tggcgggatc ttcgggctcc tcggctgtct ccaagccatt 3000 

gggctgtttg cctccatggc tgcggtgctc ctaggtgggg cccgggcatc caggttgctc 3060 

ttccagaggc tcctgtggga tgtggtgcga tctcccatca gcttctttga gcggacaccc 3120 

attggtcacc tgctaaaccg cttctccaag gagacagaca cggttgacgt ggacattcca 3180 

gacaaactcc ggtccctgct gatgtacgcc tttggactcc tggaggtcag cctggtggtg 3240 

gcagtggcta ccccactggc cactgtggcc atcctgccac tgtttctcct ctacgctggg 3300 

tttcagagcc tgtatgtggt tagctcatgc cagctgagac gcttggagtc agccagctac 3360

tcgtctgtct gctcccacat ggctgagacg ttccagggca gcacagtggt ccgggcattc 3420

cgaacccagg ccccctttgt ggctcagaac aatgctcgcg tagatgaaag ccagaggatc 3480

agtttcccgc gactggtggc tgacaggtgg cttgcggcca atgtggagct cctggggaat 3540

ggcctggtgt ttgcagccgc cacgtgtgct gtgctgagca aagcccacct cagtgctggc 3600

ctcgtgggct tctctgtctc tgctgccctc caggtgaccc agacactgca gtgggttgtt 3660

cgcaactgga cagacctaga gaacagcatc gtgtcagtgg agcggatgca ggactatgcc 3720

tggacgccca aggaggctcc ctggaggctg cccacatgtg cagctcagcc cccctggcct 3780

cagggcgggc agatcgagtt ccgggacttt gggctaagat gccgacctga gctcccgctg 3840

gctgtgcagg gcgtgtcctt caagatccac gcaggagaga aggtgggcat cgttggcagg 3900

accggggcag ggaagtcctc cctggccagt gggctgctgc ggctccagga ggcagctgag 3960

ggtgggatct ggatcgacgg ggtccccatt gcccacgtgg ggctgcacac actgcgctcc 4020

aggatcagca tcatccccca ggaccccatc ctgttccctg gctctctgcg gatgaacctc 4080

gacctgctgc aggagcactc ggacgaggct atctgggcag ccctggagac ggtgcagctc 4140

aaagccttgg tggccagcct gcccggccag ctgcagtaca agtgtgctga ccgaggcgag 4200

gacctgagcg tgggccagaa acagctcctg tgtctggcac gtgcccttct ccggaagacc 4260

cagatcctca tcctggacga ggctactgct gccgtggacc ctggcacgga gctgcagatg 4320

caggccatgc tcgggagctg gtttgcacag tgcactgtgc tgcccattgc ccaccgcctg 4380

cgctccgtga tggactgtgc ccgggttctg gtcatggaca aggggcaggt ggcagagagc 4440

ggcagcccgg cccagctgct ggcccagaag ggcctgtttt acagactggc ccaggagtca 4500

ggcctggtc 4509

<210>
<211>
<212>
<213>

Met Ala Ala Pro Ala Glu Pro Cys Ala Gly Gln Gly Val Trp Asn Gln
1 5 10 15

Thr Glu Pro Glu Pro Ala Ala Thr Ser Leu Leu Ser Leu Cys Phe Leu
20 25 30

Arg Thr Ala Gly Val Trp Val Pro Pro Met Tyr Leu Trp Val Leu Gly
35 40 45

Pro Ile Tyr Leu Leu Phe Ile His His His Gly Arg Gly Tyr Leu Arg
50 55 60

Met Ser Pro Leu Phe Lys Ala Lys Met Val Leu Gly Phe Ala Leu Ile
65 70 75 80

Val Leu Cys Thr Ser Ser Val Ala Val Ala Leu Trp Lys Ile Gln Gln
85 90 95

Gly Thr Pro Glu Ala Pro Glu Phe Leu Ile His Pro Thr Val Trp Leu
100 105 110

Thr Thr Met Ser Phe Ala Val Phe Leu Ile His Thr Glu Arg Lys Lys
115 120 125

Gly Val Gln Ser Ser Gly Val Leu Phe Gly Tyr Trp Leu Leu Cys Phe
130 135 140

Val Leu Pro Ala Thr Asn Ala Ala Gln Gln Ala Ser Gly Ala Gly Phe
145 150 155 160

Gln Ser Asp Pro Val Arg His Leu Ser Thr Tyr Leu Cys Leu Ser Leu
165 170 175

Val Val Ala Gln Phe Val Leu Ser Cys Leu Ala Asp Gln Pro Pro Phe
180 185 190

Phe Pro Glu Asp Pro Gln Gln Ser Asn Pro Cys Pro Glu Thr Gly Ala
195 200 205

Ala Phe Pro Ser Lys Ala Thr Phe Trp Trp Val Ser Gly Leu Val Trp
210 215 220

Arg Gly Tyr Arg Arg Pro Leu Arg Pro Lys Asp Leu Trp Ser Leu Gly
225 230 235 240

Arg Glu Asn Ser Ser Glu Glu Leu Val Ser Arg Leu Glu Lys Glu Trp
245 250 255

Met Arg Asn Arg Ser Ala Ala Arg Arg His Asn Lys Ala Ile Ala Phe
260 265 270

Lys Arg Lys Gly Gly Ser Gly Met Lys Ala Pro Glu Thr Glu Pro Phe
275 280 285

Leu Arg Gln Glu Gly Ser Gln Trp Arg Pro Leu Leu Lys Ala Ile Trp
290 295 300

Gln Val Phe His Ser Thr Phe Leu Leu Gly Thr Leu Ser Leu Ile Ile
305 310 315 320

Ser Asp Val Phe Arg Phe Thr Val Pro Lys Leu Leu Ser Leu Phe Leu
325 330 335

Glu Phe Ile Gly Asp Pro Lys Pro Pro Ala Trp Lys Gly Tyr Leu Leu
340 345 350

Ala Val Leu Met Phe Leu Ser Ala Cys Leu Gln Thr Leu Phe Glu Gln
355 360 365

Gln Asn Met Tyr Arg Leu Lys Val Pro Gln Met Arg Leu Arg Ser Ala
370 375 380

Ile Thr Gly Leu Val Tyr Arg Lys Val Leu Ala Leu Ser Ser Gly Ser
385 390 395 400

Arg Lys Ala Ser Ala Val Gly Asp Val Val Asn Leu Val Ser Val Asp
405 410 415

Val Gln Arg Leu Thr Glu Ser Val Leu Tyr Leu Asn Gly Leu Trp Leu
420 425 430

Pro Leu Val Trp Ile Val Val Cys Phe Val Tyr Leu Trp Gln Leu Leu
435 440 445

Gly Pro Ser Ala Leu Thr Ala Ile Ala Val Phe Leu Ser Leu Leu Pro
450 455 460

Leu Asn Phe Phe Ile Ser Lys Lys Arg Asn His His Gln Glu Glu Gln
465 470 475 480

Met Arg Gln Lys Asp Ser Arg Ala Arg Leu Thr Ser Ser Ile Leu Arg
485 490 495

Asn Ser Lys Thr Ile Lys Phe His Gly Trp Glu Gly Ala Phe Leu Asp
500 505 510

Arg Val Leu Gly Ile Arg Gly Gln Glu Leu Gly Ala Leu Arg Thr Ser
515 520 525

Gly Leu Leu Phe Ser Val Ser Leu Val Ser Phe Gln Val Ser Thr Phe
530 535 540

Leu Val Ala Leu Val Val Phe Ala Val His Thr Leu Val Ala Glu Asn
545 550 555 560

Ala Met Asn Ala Glu Lys Ala Phe Val Thr Leu Thr Val Leu Asn Ile
565 570 575

Leu Asn Lys Ala Gln Ala Phe Leu Pro Phe Ser Ile His Ser Leu Val
580 585 590

Gln Ala Arg Val Ser Phe Asp Arg Leu Val Thr Phe Leu Cys Leu Glu
595 600 605

Glu Val Asp Pro Gly Val Val Asp Ser Ser Ser Ser Gly Ser Ala Ala
610 615 620

Gly Lys Asp Cys Ile Thr Ile His Ser Ala Thr Phe Ala Trp Ser Gln
625 630 635 640

Glu Ser Pro Pro Cys Leu His Arg Ile Asn Leu Thr Val Pro Gln Gly
645 650 655

Cys Leu Leu Ala Val Val Gly Pro Val Gly Ala Gly Lys Ser Ser Leu
660 665 670

Leu Ser Ala Leu Leu Gly Glu Leu Ser Lys Val Glu Gly Phe Val Ser
675 680 685

Ile Glu Gly Ala Val Ala Tyr Val Pro Gln Glu Ala Trp Val Gln Asn
690 695 700

Thr Ser Val Val Glu Asn Val Cys Phe Gly Gln Glu Leu Asp Pro Pro
705 710 715 720

Trp Leu Glu Arg Val Leu Glu Ala Cys Ala Leu Gln Pro Asp Val Asp
725 730 735

Ser Phe Pro Glu Gly Ile His Thr Ser Ile Gly Glu Gln Gly Met Asn
740 745 750

Leu Ser Gly Gly Gln Lys Gln Arg Leu Ser Leu Ala Arg Ala Val Tyr
755 760 765

Arg Lys Ala Ala Val Tyr Leu Leu Asp Asp Pro Leu Ala Ala Leu Asp
770 775 780

Ala His Val Gly Gln His Val Phe Asn Gln Val Ile Gly Pro Gly Gly
785 790 795 800

Leu Leu Gln Gly Thr Thr Arg Ile Leu Val Thr His Ala Leu His Ile
805 810 815

Leu Pro Gln Ala Asp Trp Ile Ile Val Leu Ala Asn Gly Ala Ile Ala
820 825 830

Glu Met Gly Ser Tyr Gln Glu Leu Leu Gln Arg Lys Gly Ala Leu Val
835 840 845

Cys Leu Leu Asp Gln Ala Arg Gln Pro Gly Asp Arg Gly Glu Gly Glu
850 855 860

Thr Glu Pro Gly Thr Ser Thr Lys Asp Pro Arg Gly Thr Ser Ala Gly
865 870 875 880

Arg Arg Pro Glu Leu Arg Arg Glu Arg Ser Ile Lys Ser Val Pro Glu
885 890 895

Lys Asp Arg Thr Thr Ser Glu Ala Gln Thr Glu Val Pro Leu Asp Asp
900 905 910

Pro Asp Arg Ala Gly Trp Pro Ala Gly Lys Asp Ser Ile Gln Tyr Gly
915 920 925

Arg Val Lys Ala Thr Val His Leu Ala Tyr Leu Arg Ala Val Gly Thr
930 935 940

Pro Leu Cys Leu Tyr Ala Leu Phe Leu Phe Leu Cys Gln Gln Val Ala
945 950 955 960

Ser Phe Cys Arg Gly Tyr Trp Leu Ser Leu Trp Ala Asp Asp Pro Ala
965 970 975

Val Gly Gly Gln Gln Thr Gln Ala Ala Leu Arg Gly Gly Ile Phe Gly
980 985 990

Leu Leu Gly Cys Leu Gln Ala Ile Gly Leu Phe Ala Ser Met Ala Ala
995 1000 1005

Val Leu Leu Gly Gly Ala Arg Ala Ser Arg Leu Leu Phe Gln Arg Leu
1010 1015 1020

Leu Trp Asp Val Val Arg Ser Pro Ile Ser Phe Phe Glu Arg Thr Pro
1025 1030 1035 1040

Ile Gly His Leu Leu Asn Arg Phe Ser Lys Glu Thr Asp Thr Val Asp
1045 1050 1055

Val Asp Ile Pro Asp Lys Leu Arg Ser Leu Leu Met Tyr Ala Phe Gly
1060 1065 1070

Leu Leu Glu Val Ser Leu Val Val Ala Val Ala Thr Pro Leu Ala Thr
1075 1080 1085

Val Ala Ile Leu Pro Leu Phe Leu Leu Tyr Ala Gly Phe Gln Ser Leu
1090 1095 1100

Tyr Val Val Ser Ser Cys Gln Leu Arg Arg Leu Glu Ser Ala Ser Tyr
1105 1110 1115 1120

Ser Ser Val Cys Ser His Met Ala Glu Thr Phe Gln Gly Ser Thr Val
1125 1130 1135

Val Arg Ala Phe Arg Thr Gln Ala Pro Phe Val Ala Gln Asn Asn Ala
1140 1145 1150

Arg Val Asp Glu Ser Gln Arg Ile Ser Phe Pro Arg Leu Val Ala Asp
1155 1160 1165

Arg Trp Leu Ala Ala Asn Val Glu Leu Leu Gly Asn Gly Leu Val Phe
1170 1175 1180

Ala Ala Ala Thr Cys Ala Val Leu Ser Lys Ala His Leu Ser Ala Gly
1185 1190 1195 1200

Leu Val Gly Phe Ser Val Ser Ala Ala Leu Gln Val Thr Gln Ala Leu
1205 1210 1215

Gln Trp Val Val Arg Asn Trp Thr Asp Leu Glu Asn Ser Ile Val Ser
1220 1225 1230

Val Glu Arg Met Gln Asp Tyr Ala Trp Thr Pro Lys Glu Ala Pro Trp
1235 1240 1245

Arg Leu Pro Thr Cys Ala Ala Gln Pro Pro Trp Pro Gln Gly Gly Gln
1250 1255 1260

Ile Glu Phe Arg Asp Phe Gly Leu Arg Tyr Arg Pro Glu Leu Pro Leu
1265 1270 1275 1280

Ala Val Gln Gly Val Ser Leu Lys Ile His Ala Gly Glu Lys Val Gly
1285 1290 1295

Ile Val Gly Arg Thr Gly Ala Gly Lys Ser Ser Leu Ala Ser Gly Leu
1300 1305 1310

Leu Arg Leu Gln Glu Ala Ala Glu Gly Gly Ile Trp Ile Asp Gly Val
1315 1320 1325

Pro Ile Ala His Val Gly Leu His Thr Leu Arg Ser Arg Ile Ser Ile
1330 1335 1340

Ile Pro Gln Asp Pro Ile Leu Phe Pro Gly Ser Leu Arg Met Asn Leu
1345 1350 1355 1360

Asp Leu Leu Gln Glu His Ser Asp Glu Ala Ile Trp Ala Ala Leu Glu
1365 1370 1375

Thr Val Gln Leu Lys Ala Leu Val Ala Ser Leu Pro Gly Gln Leu Gln
1380 1385 1390

Tyr Lys Cys Ala Asp Arg Gly Glu Asp Leu Ser Val Gly Gln Lys Gln
1395 1400 1405

Leu Leu Cys Leu Ala Arg Ala Leu Leu Arg Lys Thr Gln Ile Leu Ile
1410 1415 1420

Leu Asp Glu Ala Thr Ala Ala Val Asp Pro Gly Thr Glu Leu Gln Met
1425 1430 1435 1440

Gln Ala Met Leu Gly Ser Trp Phe Ala Gln Cys Thr Val Leu Leu Ile
1445 1450 1455

Ala His Arg Leu Arg Ser Val Met Asp Cys Ala Arg Val Leu Val Met
1460 1465 1470

Asp Lys Gly Gln Val Ala Glu Ser Gly Ser Pro Ala Gln Leu Leu Ala
1475 1480 1485

Gln Lys Gly Leu Phe Tyr Arg Leu Ala Gln Glu Ser Gly Leu Val
1490 1495 1500

<210> 9 
<211> 18 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 9 

ctdgtdgcdg tdgtdggn 18 

<210> 10 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 10 

atggccgcgc ctgctgagc 19 

<210> 11 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 11 

gtctacgaca ccagggtcaa 20 

<210> 12 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 12 

ctgcctggaa gaagttgacc 20 

<210> 13
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Sequence source: /note="synthetic construct"
<400> 13

ctggaatgtc cacgtcaacc 20

<210> 14 
<211> 20 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 14 

ggagacagac acggttgacg 20

<210> 15 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Sequence source: /note="synthetic construct"
<400> 15 

gcagaccagg cctgactcc 19 

<210> 16
<211> 24
<212> DNA

<213> Artificial Sequence
<220>

<223> Sequence source: /note="synthetic construct"
<400> 16

rctnavngcn swnarnggnt crtc 24

<210> 17
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> Sequence source: /note="synthetic construct"
<400> 17

cgggatccag rgaraayath ctntttggn 29

<210> 18
<211> 29
<212> DNA

<213> Artificial Sequence
<220>

<223> Sequence source: /note="synthetic construct"
<400> 18

cggaattcnt crtchagnag rtadatrtc 29